Friday, January 29, 2016

Dhanji R. Prasanna was a developer on Google Wave

[A]s a programmer you must have a series of wins, every single day. […] It is what makes you eager for the next feature, and the next after that. And a large team is poison to small wins. The nature of large teams is such that even when you do have wins, they come after long, tiresome and disproportionately many hurdles. And this takes all the wind out of them. Often when I shipped a feature it felt more like relief than euphoria.

Product management - PRD


Shorthand flow diagram for how the user interacts with the product : 



The Economist excerpts from 30th Jan 2016 edition (The Brawl Begins)

  • Artificial currency (naira) controls in Nigeria are hurting the local economy
  • Steelmaking industries in rich countries are suffering
  • Disruptions in the art auction market - Artsy, Christie's, Sotheby's
  • Fintech disrupting insurance market
  • Walmart raising wages significantly
  • Hailstorm in the USA - privacy issues - police locating people by simulating mobile phone towers
  • Advances in figuring out causes of Schizophrenia
  • Controlling miniature satellites - Cubism
  • Zika virus and pregnancy advisory
  • Suicide jungle of Japan
  • Africa's gym craze
  • Tibet - Bottling Himalayan water could be bad for the region's environment
  • Refugees avoiding France and favoring Germany

Tuesday, January 26, 2016

NoSQL notes



Source

  • SQL intro:
    • Declarative - SQL allows you to ask complex questions without thinking about how the data is laid out on disk, which indices to use to access the data, or what algorithms to use to process the data. A significant architectural component of most relational databases is a query optimizer.
  • Problems with SQL:
    • Complexity leads to unpredictability. SQL's expressiveness makes it challenging to reason about the cost of each query, and thus the cost of a workload.
    • The relational data model is strict
    • If the data grows past the capacity of one server, it has to be partitioned across machines and often denormalized
  • Key-Data Structure Stores - Redis
  • Key-Document Stores - CouchDB/MongoDB/Riak
  • BigTable Column Family Stores - HBase/Cassandra. In this model, a key identifies a row, which contains data stored in one or more Column Families (CFs). Conceptually, one can think of Column Families as storing complex keys of the form (row ID, CF, column, timestamp), mapping to values which are sorted by their keys. This design results in data modeling decisions which push a lot of functionality into the keyspace.
  • HyperGraphDB and Neo4j are two popular NoSQL storage systems for storing graph-structured data
  • Redis is the notable exception to the no-transaction trend. On a single server, it provides a MULTI command to combine multiple operations atomically and consistently, and a WATCH command to allow isolation.
  • Single server durability - primarily by controlling fsync frequency
  • Multiple server durability - With subtle differences, Riak, Cassandra, and Voldemort allow the user to specify N, the number of machines which should ultimately have a copy of the data, and W<N, the number of machines that should confirm the data has been written before returning control to the user.
  • Sharding/Partitioning means that no one machine has to handle the write workload on the entire dataset, but no one machine can answer queries about the entire dataset.
  • Sharding adds system complexity, and where possible, you should avoid it. Try these: read replicas and caching. Facebook has Memcached installations in the range of tens of terabytes of memory!
  • Consistent Hashing Ring
  • Conflict resolution by vector clocks
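
The consistent hashing ring and the "N copies" idea from the notes above can be sketched in a few lines. This is a minimal illustration, not how Riak, Cassandra, or Voldemort implement it: the `HashRing` class, the use of MD5, and the 64 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent hashing ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, vnodes=64):
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> physical node name
        self._vnodes = vnodes
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node is placed on the ring at several positions.
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        # Only this node's positions disappear; all others stay put.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def replicas(self, key, n=1):
        """Walk clockwise from the key and return the first n distinct nodes."""
        if not self._points:
            return []
        start = bisect.bisect(self._points, _hash(key))
        found = []
        for step in range(len(self._points)):
            point = self._points[(start + step) % len(self._points)]
            node = self._owner[point]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found
```

Calling `HashRing(["node-a", "node-b", "node-c"]).replicas("user:42", n=3)` returns the N machines that should hold a copy of that key. Removing a node only moves the keys that node owned, which is the property that makes the ring attractive for sharding.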

Sharing an AMI with Specific AWS Accounts

Thursday, January 21, 2016

Excerpts from Making It Right: Product Management for a Startup World

Source

If you want to build a ship, don't drum people up together to collect wood, and don't assign them tasks and work. Rather teach them to long for the endless immensity of the sea. - Antoine de Saint-Exupéry

Engineers don't hate process. They hate process that can't defend itself. - Michael Lopp (source)

MediaWiki's usage of DB slaves

Source

The system administrator can specify, in MediaWiki's configuration, that there is one master database server and any number of slave database servers; a weight can be assigned to each server. The load balancer will send all writes to the master, and will balance reads according to the weights. It also keeps track of the replication lag of each slave. If a slave's replication lag exceeds 30 seconds, it will not receive any read queries to allow it to catch up; if all slaves are lagged more than 30 seconds, MediaWiki will automatically put itself in read-only mode.

MediaWiki's "chronology protector" ensures that replication lag never causes a user to see a page that claims an action they've just performed hasn't happened yet: for instance, if a user renames a page, another user may still see the old name, but the one who renamed will always see the new name, because he's the one who renamed it. This is done by storing the master's position in the user's session if a request they made resulted in a write query. The next time the user makes a read request, the load balancer reads this position from the session, and tries to select a slave that has caught up to that replication position to serve the request. If none is available, it will wait until one is. It may appear to other users as though the action hasn't happened yet, but the chronology remains consistent for each user.
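
The behaviour described in these two paragraphs can be sketched roughly as follows. This is not MediaWiki's code (MediaWiki is written in PHP): the `Replica` class, the integer replication positions, and the function names are assumptions made for illustration, and where the real load balancer would wait for a replica to catch up, this sketch simply returns `None`.

```python
import random
from dataclasses import dataclass

MAX_LAG_SECONDS = 30  # lag threshold mentioned in the text above


@dataclass
class Replica:
    name: str
    weight: int          # relative share of read traffic
    lag_seconds: float   # current replication lag
    position: int        # replication position applied so far (hypothetical integer)


def usable(replicas):
    """Replicas whose lag does not exceed the threshold."""
    return [r for r in replicas if r.lag_seconds <= MAX_LAG_SECONDS]


def is_read_only(replicas):
    """The site goes read-only when every replica is lagged too far."""
    return not usable(replicas)


def record_write(session, master_position):
    """After a write, remember the master's position in the user's session."""
    session["master_position"] = master_position


def pick_for_read(replicas, session, rng=random):
    """Weighted choice among replicas that have caught up to the session's position.

    Returns None when no replica qualifies; a real system would wait and retry.
    """
    needed = session.get("master_position", 0)
    candidates = [
        r for r in usable(replicas) if r.position >= needed and r.weight > 0
    ]
    if not candidates:
        return None
    weights = [r.weight for r in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

A user who has just written at position 110 is only routed to replicas at position 110 or later, while a user with an empty session can be served by any replica under the lag threshold.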

Wednesday, January 20, 2016

SQLAlchemy notes

Source

  1. Hybrid property/method: returns a plain Python value when accessed on an instance, but a SQL expression when accessed on the class.
  2. Alembic - for migrations
  3. Reflection - for loading table structure from DB
  4. SQLAlchemy Core - has equivalents for SQL's sum, count, ORDER BY, GROUP BY, etc.
  5. Session states for an object - Transient/Pending/Persistent/Detached
  6. session.expunge() for removing an object from session
  7. SQLAlchemy ORM is a separate layer built on top of Core
  8. (Core) ResultProxy - rows in the result set can be accessed either by index or by column name
  9. Association proxy: e.g. for a many-to-many relationship between Cookie and Ingredient, where you just want to create ingredients for a cookie or just fetch the ingredients' names. With an association proxy you can do that directly, without looping through the intermediate objects.
  10. sqlacodegen - generates model code by reflecting an existing database
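
To make item 1 concrete, here is a pure-Python stand-in for the idea behind a hybrid property. It is not SQLAlchemy's implementation: in SQLAlchemy the real decorator is `hybrid_property` from `sqlalchemy.ext.hybrid`, and the class-level result is an expression object, not a string. The `toy_hybrid` descriptor and the `Interval` class below are hypothetical and exist only to show the instance-versus-class behaviour.

```python
class toy_hybrid:
    """Descriptor: instance access computes a value, class access returns SQL text."""

    def __init__(self, instance_func):
        self._instance_func = instance_func
        self._class_func = None

    def expression(self, class_func):
        # Register the function used when the attribute is read on the class.
        self._class_func = class_func
        return self

    def __get__(self, obj, owner):
        if obj is None:
            # Accessed on the class: build the SQL-side expression.
            if self._class_func is None:
                return self
            return self._class_func(owner)
        # Accessed on an instance: compute the value in Python.
        return self._instance_func(obj)


class Interval:
    __tablename__ = "interval"

    def __init__(self, start, end):
        self.start = start
        self.end = end

    @toy_hybrid
    def length(self):
        return self.end - self.start

    @length.expression
    def length(cls):
        # A plain string stands in for a real SQL expression object.
        return f"{cls.__tablename__}.end - {cls.__tablename__}.start"
```

Here `Interval(5, 12).length` evaluates in Python, while `Interval.length` yields the text a query would use for the same computation.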

Wednesday, January 6, 2016

opencv notes - circle/line detection

edge detection : Imgproc.Canny
line detection : Imgproc.HoughLinesP and Imgproc.HoughLines
circle detection : Imgproc.HoughCircles

Tuesday, January 5, 2016

OpenCV notes - Edge detection

Edge detection using Gaussian and Laplacian pyramids

OpenCV implements the relevant downsizing and upsizing functions as cv2.pyrDown and cv2.pyrUp.
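
As a hedged sketch of how those two functions relate to the pyramids (this is the standard construction, not something spelled out in these notes), with I the input image, G_i the Gaussian levels, and L_i the Laplacian levels:

```latex
G_0 = I, \qquad G_{i+1} = \mathrm{pyrDown}(G_i)

L_i = G_i - \mathrm{pyrUp}(G_{i+1})
```

Each Laplacian level keeps the detail that was lost when downsizing, which is largely edge information.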

OpenCV notes - Eulerian video magnification

A demo: http://www.extremetech.com/extreme/149623-mit-releases-open-source-software-that-reveals-invisible-motion-and-detail-in-video

Python implementation: https://github.com/brycedrennan/eulerian-magnification

Fourier transform for dummies

http://math.stackexchange.com/a/72479/13811

http://betterexplained.com/articles/an-interactive-guide-to-the-fourier-transform/

OpenCV notes - Optical Flow

Optical flow - e.g. tracking how far an object has moved from its
previous position, such as tracking a nod or shake of a face.

OpenCV's calcOpticalFlowPyrLK function implements the Lucas-Kanade
method of computing optical flow.

OpenCV's goodFeaturesToTrack function selects features based on
the algorithm described below:

As the name suggests, the Good Features to Track algorithm (also known
as the Shi-Tomasi algorithm) attempts to select features that work
well with the algorithms of tracking and the use cases of tracking.
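
The link between the two can be written down briefly. This is the textbook formulation, not taken from these notes; I_x, I_y, I_t are the image derivatives in x, y, and time, (u, v) is the flow, and the sums run over a small window around the point:

```latex
I_x u + I_y v + I_t = 0

M = \sum \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix},
\qquad
\begin{pmatrix} u \\ v \end{pmatrix}
  = -M^{-1} \sum \begin{pmatrix} I_x I_t \\ I_y I_t \end{pmatrix}

\min(\lambda_1, \lambda_2) > \text{threshold}
```

Lucas-Kanade has to invert M, and the Shi-Tomasi criterion keeps only points where the smaller eigenvalue of M is large, so the selected features are the ones where that inversion is well behaved.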

Monday, January 4, 2016
