- Ease of performing linear algebra
Historical Alternatives: FORTRAN, MATLAB and C (also, Ada, APL, BASIC …)
FORTRAN:
- Modern implementations support vectorized operations
- Availability of legacy libraries (advantage).
- Portability issues (because of compiler extensions by commercial vendors)
MATLAB:
- Interpreted language.
- Vectorized ops
- rich run time library
- excellent docs.
- Threading is not available.
C and C++:
- Linear algebra tends to be painful
- NumPy and SciPy bring linear algebra to Python
- Indexing (similar to MATLAB and FORTRAN)
numpy.linalg – common linear algebra functions
- inversion, conditioning, decompositions, stats, eigenvalues
scipy.io – MATLAB interoperability.
- (NumPy uses LAPACK for calculations.)
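
As a rough illustration of the numpy.linalg operations listed above, here is a minimal sketch. The matrix and vector values are made up for the example, and it assumes NumPy is installed.

```python
import numpy as np

# A small symmetric positive-definite matrix (values are illustrative only).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

A_inv = np.linalg.inv(A)             # inversion
cond = np.linalg.cond(A)             # conditioning (2-norm condition number)
L = np.linalg.cholesky(A)            # a decomposition: A = L @ L.T
eigenvalues = np.linalg.eigvalsh(A)  # eigenvalues of a symmetric matrix
x = np.linalg.solve(A, b)            # solve A x = b without forming inv(A)

# MATLAB/FORTRAN-style slicing, but zero-based: first column of A.
first_column = A[:, 0]
```

For solving a linear system, `np.linalg.solve` is usually a better choice than multiplying by the explicit inverse.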
- Parallel computing – Monte-Carlo analysis.
- Python provide
- Python GUI tools are useful for monitoring simulations.
- Sticking points – packaging, NumPy and SciPy vs MATLAB (library advantages), and familiarity.
Log Analysis with Python
Presentation by Scott McCarty (@fatherlinux).
Log analysis. Why?
- baselining, reporting, troubleshooting.
- baselining is gathering stats to find out what a normal time segment looks like.
- Considerations – SEAM – Support/Evolvement/Aesthetics/Maturity
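
Baselining as described above could be sketched roughly as follows. The log format (an ISO 8601 timestamp at the start of each line), the sample data, and the three-standard-deviation threshold are all assumptions made for illustration, not something from the talk.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def events_per_hour(lines):
    """Count log lines per hour; assumes each line starts with an ISO 8601 timestamp."""
    counts = Counter()
    for line in lines:
        stamp = datetime.strptime(line[:19], "%Y-%m-%dT%H:%M:%S")
        counts[stamp.replace(minute=0, second=0)] += 1
    return counts


def baseline(counts):
    """Return (mean, population standard deviation) of the hourly counts."""
    values = list(counts.values())
    return mean(values), pstdev(values)


def is_unusual(count, avg, spread, threshold=3.0):
    """Flag a count more than `threshold` standard deviations from the baseline."""
    return abs(count - avg) > threshold * spread


# Made-up "normal" period: six hours with roughly ten requests each.
normal_lines = [
    f"2010-08-01T{hour:02d}:00:{i:02d} GET /index.html"
    for hour, n in enumerate([10, 12, 11, 9, 10, 8])
    for i in range(n)
]
avg, spread = baseline(events_per_hour(normal_lines))
```

With the baseline in hand, `is_unusual(40, avg, spread)` flags an hour with 40 requests, while an hour with 11 requests passes as normal.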
- virtualenv bootstrap scripts – extend_parser(),
- pip – -f
- pypi.python.org/pypi/modern-package-template
- look at Fabric again for automating tasks.
- Workit
vim open space led by @mitechie
python + vim files on github: http://github.com/mitechie/pyvim
todo in emacs: column edit demo using database ids check out
window manager. It is written in Lua, and hence hackable compared to
To start quickfix window –
Use the leader character. recommended -
look up the book ‘From Bash to Z Shell’ (Apress).
plugins:
- bufonly.vim
- lusty juggler (get past the name!)
- vcs plugin to work with git, hg, svn, etc.
by William McVey.
HBase is a column-oriented database built on top of the Hadoop Distributed File System (HDFS).
Bloom filters allow you to efficiently look up sparsely populated columns.
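
The idea behind a Bloom filter can be sketched in a few lines. This is a toy pure-Python illustration of the data structure, not how HBase implements it. The class name, bit-array size, hashing scheme, and key format are all assumptions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive several bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


# Hypothetical "row/column" keys known to be present in one store file.
present = BloomFilter()
present.add("row1/cf:email")
present.add("row7/cf:phone")
```

A negative answer from `might_contain` is definite, so a lookup for an absent column can skip the file entirely. A positive answer still needs a real read to confirm.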
- a timestamp is attached to each value written to a row/column.
- data can be retrieved based on time.
- cell contents can have a Time-To-Live set, to auto-expire data.
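
The versioning behaviour in the bullets above can be modelled with a toy in-memory cell. This is only a sketch of the concept and not the HBase client API. `VersionedCell`, `put`, `get`, and the timestamps are hypothetical names and values.

```python
import time


class VersionedCell:
    """Toy model of one row/column cell that keeps timestamped versions."""

    def __init__(self, ttl_seconds=None):
        self.ttl_seconds = ttl_seconds
        self.versions = []  # list of (timestamp, value), sorted by timestamp

    def put(self, value, timestamp=None):
        timestamp = time.time() if timestamp is None else timestamp
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda version: version[0])

    def get(self, as_of=None, now=None):
        """Return the newest unexpired value written at or before `as_of`."""
        now = time.time() if now is None else now
        as_of = now if as_of is None else as_of
        for timestamp, value in reversed(self.versions):
            expired = (self.ttl_seconds is not None
                       and now - timestamp > self.ttl_seconds)
            if timestamp <= as_of and not expired:
                return value
        return None


cell = VersionedCell(ttl_seconds=60)
cell.put("v1", timestamp=100)
cell.put("v2", timestamp=130)
```

Here `cell.get(as_of=120, now=140)` retrieves the older value by time, and once more than 60 seconds have passed since a write, that version is no longer returned.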
No joins; there is no SQL. Configuration, queries and updates are all performed via the HBase API.
Schema design considerations:
- think in terms of memcached: a single-key access pattern.
- remember HBase allows you to query multiple columns for a particular index.
- sparse columns are your friends.
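
These schema considerations can be pictured with plain dictionaries standing in for a table. The row-key format, column names, and the `get_columns` helper are made up for this sketch and are not part of any HBase API.

```python
# A plain dict standing in for a table: row key -> {column: value}.
# Each row stores only the columns it actually has (sparse columns).
table = {
    "user:alice": {"info:email": "alice@example.com", "info:city": "Columbus"},
    "user:bob": {"info:email": "bob@example.com", "stats:logins": 42},
}


def get_columns(table, row_key, columns):
    """Single-key lookup returning only the requested columns that exist."""
    row = table.get(row_key, {})
    return {name: row[name] for name in columns if name in row}


result = get_columns(table, "user:bob",
                     ["info:email", "info:city", "stats:logins"])
```

The lookup touches exactly one row key, and columns a row never stored simply do not appear in the result.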
Stargate – a RESTful web service on HBase.
Thrift – cross-language; pip install hbase-thrift.
Avro – (new) a fast binary data serialization protocol(?); available only in development snapshots.
Dumbo – map/reduce API for Python at the HBase level.
Cloudera Hadoop distribution.
protip: terminator – http://www.tenshu.net/terminator/
explore git’s make tag.
- investigate coroutines in python.
- threads are like gotos.
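
As a starting point for the coroutine item above, here is a minimal generator-based coroutine. The `running_average` function is a hypothetical example, not something shown in a talk.

```python
def running_average():
    """Generator-based coroutine: receives numbers via send(), yields the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)              # prime the coroutine: run to the first yield
first = averager.send(10)   # mean of [10]
second = averager.send(20)  # mean of [10, 20]
averager.close()
```

Control changes hands only at the explicit `yield`, which is one reason coroutines can be easier to reason about than threads that may be preempted at any point.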