Saturday, December 31, 2016

react.js with PHP, cookies/CORS/session id etc

I am working on a react.js app on my Windows machine. The dev server is running on http://localhost:3000. While connecting to my backend PHP server, I ran into multiple issues related to CORS (Cross-Origin Resource Sharing).

Here are the problems and their solutions.

If you want everything to work smoothly, do this.
In your PHP code, add this:

header("Access-Control-Allow-Origin: http://localhost:3000");
header("Access-Control-Allow-Credentials: true");

or in your .htaccess file (this needs mod_headers) add this:
Header set Access-Control-Allow-Origin "http://localhost:3000"
Header set Access-Control-Allow-Credentials "true"

and while sending requests from react.js (I was using the fetch API), add credentials: 'include' to every request. The endpoint below is a placeholder for your own PHP URL:

// replace the URL with your PHP endpoint
fetch('http://localhost/your-endpoint.php', {credentials: 'include'}).then(function (response) {
    return response.json()
}).then(function (resp) {
    that.setState({blogList: resp})
})
Another solution is to run Chrome like this (this disables CORS checks, so use it only for local testing):
chrome.exe --disable-web-security --user-data-dir=c:\mydata

Another error message you may get, depending upon your code/config, is this (the solution above fixes it too):
A wildcard '*' cannot be used in the 'Access-Control-Allow-Origin' header when the credentials flag is true.
Another problem you might encounter is this (it also disappears if you follow the first solution):
1. PHP sends PHPSESSID as a response cookie in your first request,
2. but your next request does not send that cookie back, or uses a different PHPSESSID.

Friday, December 30, 2016

React.js Routes and Navigation Bar

Thursday, December 29, 2016

React.js - fetching html from a URL and setting it in an element

import * as React from "react";

export default class App extends React.Component {
    constructor(props) {
        super(props);
        this.state = {
            data: ''
        };
    }

    componentDidMount() {
        let self = this;
        // replace with the URL you want to fetch the HTML from (the original URL is elided here)
        fetch('/some-page.html').then(function (response) {
            return response.text()
        }).then(function (text) {
            self.setState({
                data: text
            });
        });
    }

    render() {
        return (<div dangerouslySetInnerHTML={{__html: this.state.data}}></div>);
    }
}

Tuesday, December 13, 2016


getInitialState() will make it available as this.state
multiple apps in one page
return inside render is same as createElement
return () pattern is required since in the parenthesis you don't have to worry about multiple lines
promise in ES6 - reject/resolve will tie to catch/then

lifecycle methods - componentWillMount componentDidMount componentWillUnmount

higher order component - not clear
propTypes - good for types/validation

performance improvement => shouldComponentUpdate/add key to siblings

Sunday, December 4, 2016

React.js fundamentals

Babel, webpack: ES6 => ES5

functional programming -> map,filter,reduce
execution context -> call,bind,apply

ES6 new features
template strings
default arguments, variable arguments
arrow functions
lexical scoping for vars, but dynamic binding for this
arrow functions bind this lexically instead
destructuring => arrays and objects
react export,import similar to destructuring 
ES6 classes are similar to other OOP classes, but use prototyping in background

Sunday, October 2, 2016

huffman encoding java

import java.util.Comparator;
import java.util.PriorityQueue;

public class HuffMan {
    public static void main(String[] args) {
        PriorityQueue<Node1> q = new PriorityQueue<Node1>(new Comparator<Node1>() {
            public int compare(Node1 n1, Node1 n2) {
                return n1.freq.compareTo(n2.freq);
            }
        });
        q.add(new Node1('a', 2));
        q.add(new Node1('b', 2));
        q.add(new Node1('c', 3));
        q.add(new Node1('d', 3));
        q.add(new Node1('e', 5));
        q.add(new Node1('f', 6));
        q.add(new Node1('g', 7));

        // repeatedly merge the two lowest-frequency nodes
        while (q.size() > 1) {
            Node1 n1 = q.poll();
            Node1 n2 = q.poll();
            Node1 n3 = new Node1(' ', n1.freq + n2.freq);
            n3.l = n1;
            n3.r = n2;
            q.add(n3);
        }
        Node1 root = q.poll();
        dfs(root, "");
        print(root);
    }

    // print the code assigned to every leaf
    private static void print(Node1 root) {
        if (root.l == null && root.r == null) System.out.println(root.val + " " + root.freq + " " + root.code);
        else {
            if (root.l != null) print(root.l);
            if (root.r != null) print(root.r);
        }
    }

    // assign codes: 0 for a left edge, 1 for a right edge
    private static void dfs(Node1 root, String prefix) {
        if (root.l == null && root.r == null) root.code = prefix;
        else {
            if (root.l != null) dfs(root.l, prefix + "0");
            if (root.r != null) dfs(root.r, prefix + "1");
        }
    }
}

class Node1 {
    Node1 l;
    Node1 r;
    char val;
    Integer freq;
    String code;

    public Node1(char val, int freq) {
        l = r = null;
        this.val = val;
        this.freq = freq;
    }
}

Tuesday, September 27, 2016

Segment Tree Sum with Lazy Update

public class SegmentTreeSum {
    private static int[] arr = {1,9,2,3,5,8,90,1,2,88};
    private static Node[] stArr;
    private static int[] lazy;

    private static void update(int root, int low, int high, int diff) {
        if (root >= stArr.length) return;
        Node n = stArr[root];
        if (n == null) return;

        // apply any pending lazy value before touching this node
        if (lazy[root] > 0) nonLazyUpdate(root, lazy[root]);

        //no intersection
        if (low > n.r || high < n.l) return;

        //n is contained between low & high
        if (n.l >= low && n.r <= high) {
            int totalDiff = (n.r - n.l + 1) * diff;
            n.val += totalDiff;
            int left = root * 2 + 1;
            int right = root * 2 + 2;
            // defer the children's update: just mark them lazy
            if (left < stArr.length) lazy[left] += diff;
            if (right < stArr.length) lazy[right] += diff;
            updateParents(root, totalDiff);
            return;
        }

        //partial overlap: low <= n.r && high >= n.l
        update(root * 2 + 1, low, high, diff);
        update(root * 2 + 2, low, high, diff);
    }

    // apply the pending lazy value at root and push it down one level
    private static void nonLazyUpdate(int root, int diff) {
        if (root >= stArr.length) return;
        Node n = stArr[root];
        if (n == null) return;
        n.val += diff * (n.r - n.l + 1);
        lazy[root] = 0;
        int leftChild = 2 * root + 1;
        int rightChild = 2 * root + 2;
        if (leftChild < stArr.length) lazy[leftChild] += diff;
        if (rightChild < stArr.length) lazy[rightChild] += diff;
    }

    private static void updateParents(int root, int totalDiff) {
        if (root == 0) return;
        int parent = (root - 1) / 2;
        Node n = stArr[parent];
        n.val += totalDiff;
        updateParents(parent, totalDiff);
    }

    private static int query(int root, int low, int high) {
        if (root >= stArr.length) return 0;
        Node n = stArr[root];
        if (n == null) return 0;
        if (lazy[root] > 0) nonLazyUpdate(root, lazy[root]);
        //no intersection
        if (low > n.r || high < n.l) return 0;
        //n is contained between low & high
        if (n.l >= low && n.r <= high) return n.val;
        //partial overlap: low <= n.r && high >= n.l
        int l_sum = query(root * 2 + 1, low, high);
        int r_sum = query(root * 2 + 2, low, high);
        return l_sum + r_sum;
    }

    private static void construct(int root, int low, int high) {
        if (high < low) return;
        if (root >= stArr.length) return;
        Node n = new Node(low, high);
        stArr[root] = n;
        if (low == high) { n.val = arr[low]; return; }
        int mid = low + (high - low) / 2;
        construct(2 * root + 1, low, mid);
        construct(2 * root + 2, mid + 1, high);
        Node left = stArr[2 * root + 1];
        Node right = stArr[2 * root + 2];
        int lval = left != null ? left.val : 0;
        int rval = right != null ? right.val : 0;
        n.val = lval + rval;
    }

    public static void main(String[] args) {
        int stlen = (int) Math.ceil(Math.log(arr.length * 1l) / Math.log(2l)) + 1;
        stArr = new Node[(int) (Math.pow(2, stlen) - 1)];
        lazy = new int[stArr.length];
        construct(0, 0, arr.length - 1);
        update(0, 2, 5, 2);                 // add 2 to arr[2..5]
        System.out.println(query(0, 0, 9)); // 217 (= 209 + 4*2)
    }
}

class Node {
    int l;
    int r;
    int val;

    public Node(int l, int r) {
        this.l = l;
        this.r = r;
    }
}

Monday, September 26, 2016

Segment Tree For Min Query Java

public class SegmentTreeMin {
    private static int[] arr = {1,9,2,3,5,8,90,1,2,88};
    private static Node[] stArr;

    // returns the INDEX (into arr) of the minimum in [low,high], or -1
    private static int query(int root, int low, int high) {
        if (root >= stArr.length) return -1;
        Node n = stArr[root];
        if (n == null) return -1;

        //no intersection
        if (low > n.r || high < n.l) return -1;

        //n is contained between low & high
        if (n.l >= low && n.r <= high) return n.val;

        //partial overlap: low <= n.r && high >= n.l
        int l_ind = query(root * 2 + 1, low, high);
        int r_ind = query(root * 2 + 2, low, high);
        int min = Integer.MAX_VALUE;
        int min_index = -1;
        if (l_ind > -1) { min = arr[l_ind]; min_index = l_ind; }
        if (r_ind > -1 && arr[r_ind] < min) { min = arr[r_ind]; min_index = r_ind; }
        return min_index;
    }

    private static void construct(int root, int low, int high) {
        if (high < low) return;
        if (root >= stArr.length) return;
        Node n = new Node(low, high);
        stArr[root] = n;
        if (low == high) { n.val = low; return; }
        int mid = low + (high - low) / 2;
        construct(2 * root + 1, low, mid);
        construct(2 * root + 2, mid + 1, high);
        Node left = stArr[2 * root + 1];
        Node right = stArr[2 * root + 2];
        int lval = left != null ? arr[left.val] : Integer.MAX_VALUE;
        int rval = right != null ? arr[right.val] : Integer.MAX_VALUE;
        if (lval < rval) n.val = left.val; else n.val = right.val;
    }

    public static void main(String[] args) {
        int stlen = (int) Math.ceil(Math.log(arr.length * 1l) / Math.log(2l)) + 1;
        stArr = new Node[(int) (Math.pow(2, stlen) - 1)];
        construct(0, 0, arr.length - 1);
        int min_index = query(0, 5, 7);
        int minVal = min_index > -1 ? arr[min_index] : -1;
        System.out.println(min_index + " " + minVal); // index 7, value 1
        /*
        for (int i = 0; i < stArr.length; i++) {
            Node n = stArr[i];
            if (n != null) System.out.println(i + " " + n.l + " " + n.r + " " + n.val);
        }
        */
    }
}

class Node {
    int l;
    int r;
    int val;

    public Node(int l, int r) {
        this.l = l;
        this.r = r;
    }
}

java heap

import java.util.Arrays;

public class HeapTest {

    public static void main(String[] args) {
        int[] heap = { 1, 9, 8, 2, 11, 90, 1, 34, 32, 21, 43, 34, 56, 78, 89, 87, 69 };
        int heapSize = heap.length;

        System.out.println("before heap creation " + Arrays.toString(heap));
        createHeap(heap, heapSize);
        System.out.println("after heap creation " + Arrays.toString(heap));
        heapSort(heap, heapSize);
        System.out.println("after heap sort " + Arrays.toString(heap));
    }

    // build a max-heap by bubbling up each element in turn
    private static void createHeap(int[] heap, int heapSize) {
        for (int i = 1; i < heapSize; i++) {
            bubble_up(heap, i);
        }
    }

    // repeatedly move the max to the end of the shrinking heap
    private static void heapSort(int[] heap, int heapSize) {
        while (heapSize >= 1) {
            extract(heap, heapSize);
            heapSize--;
        }
    }

    private static int extract(int[] arr, int len) {
        if (len == 0)
            return -1;
        int max = arr[0];
        swap(arr, 0, len - 1);
        bubble_down(arr, 0, len - 2);
        return max;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    private static void bubble_down(int[] a, int root, int last_index) {
        if (last_index <= 0)
            return;
        int l = 2 * root + 1;
        if (l > last_index)
            return;
        int r = 2 * root + 2;
        int new_parent = l;
        if (r <= last_index && a[r] > a[l])
            new_parent = r;
        if (a[root] < a[new_parent]) {
            swap(a, root, new_parent);
            bubble_down(a, new_parent, last_index);
        }
    }

    private static void bubble_up(int[] heap, int last_index) {
        if (last_index == 0)
            return;
        int parent = (last_index - 1) / 2;
        if (heap[last_index] > heap[parent]) {
            swap(heap, last_index, parent);
            bubble_up(heap, parent);
        }
    }
}

Wednesday, September 21, 2016

java rmq using sparse table algorithm

int[][] m = process(a);

public static int query(int i, int j, int[][] m, int[] a) {
    int k = binlog(j - i + 1);
    return Math.min(a[m[i][k]], a[m[j + 1 - (1 << k)][k]]);
}

public static int binlog(int bits) { // returns 0 for bits=0
    int log = 0;
    if ((bits & 0xffff0000) != 0) { bits >>>= 16; log = 16; }
    if (bits >= 256) { bits >>>= 8; log += 8; }
    if (bits >= 16) { bits >>>= 4; log += 4; }
    if (bits >= 4) { bits >>>= 2; log += 2; }
    return log + (bits >>> 1);
}

private static int[][] process(int[] a) {
    int n = a.length;
    int[][] m = new int[n][binlog(n) + 1];
    //initialize M for the intervals with length 1
    for (int i = 0; i < n; i++)
        m[i][0] = i;
    //compute values from smaller to bigger intervals
    for (int j = 1; 1 << j <= n; j++)
        for (int i = 0; i + (1 << j) - 1 < n; i++)
            if (a[m[i][j - 1]] < a[m[i + (1 << (j - 1))][j - 1]])
                m[i][j] = m[i][j - 1];
            else
                m[i][j] = m[i + (1 << (j - 1))][j - 1];
    return m;
}
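The pieces above stitch together into one compilable class. The sample array below is made up for illustration; the functions are the ones from this post:

```java
public class RmqSparseTable {
    static int[] a = {1, 9, 2, 3, 5, 8, 90, 1, 2, 88};

    public static void main(String[] args) {
        int[][] m = process(a);
        System.out.println(query(5, 7, m, a)); // min of {8, 90, 1} -> 1
        System.out.println(query(1, 3, m, a)); // min of {9, 2, 3}  -> 2
    }

    static int query(int i, int j, int[][] m, int[] a) {
        int k = binlog(j - i + 1);
        // two overlapping power-of-two windows cover [i, j]
        return Math.min(a[m[i][k]], a[m[j + 1 - (1 << k)][k]]);
    }

    static int binlog(int bits) { // floor(log2), returns 0 for bits=0
        int log = 0;
        if ((bits & 0xffff0000) != 0) { bits >>>= 16; log = 16; }
        if (bits >= 256) { bits >>>= 8; log += 8; }
        if (bits >= 16) { bits >>>= 4; log += 4; }
        if (bits >= 4) { bits >>>= 2; log += 2; }
        return log + (bits >>> 1);
    }

    static int[][] process(int[] a) {
        int n = a.length;
        int[][] m = new int[n][binlog(n) + 1];
        for (int i = 0; i < n; i++) m[i][0] = i; // length-1 intervals
        for (int j = 1; 1 << j <= n; j++)        // double the interval length each round
            for (int i = 0; i + (1 << j) - 1 < n; i++)
                if (a[m[i][j - 1]] < a[m[i + (1 << (j - 1))][j - 1]])
                    m[i][j] = m[i][j - 1];
                else
                    m[i][j] = m[i + (1 << (j - 1))][j - 1];
        return m;
    }
}
```

Preprocessing is O(n log n); each query is O(1) because any range is covered by two overlapping power-of-two windows.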

java rmq using sqrt approach - range minimum query using square root approach

int[] a;
int bucketSize = (int) Math.ceil(Math.sqrt(a.length));

int[] m = process(a, bucketSize);

int ans = query(i, j, m, a, bucketSize);

private static int query(int i, int j, int[] m, int[] a, int bucketSize) {
    int curr_min = a[i];
    // short range: scan it directly
    if (j - i + 1 < 2 * bucketSize) {
        for (int k = i; k <= j; k++) {
            curr_min = Math.min(curr_min, a[k]);
        }
        return curr_min;
    }
    int seg_start = (i + bucketSize - 1) / bucketSize; // first full bucket
    int seg_end = (j + 1) / bucketSize - 1;            // last full bucket
    curr_min = a[m[seg_start]];
    for (int k = seg_start; k <= seg_end; k++)
        curr_min = Math.min(curr_min, a[m[k]]);

    // head: elements before the first full bucket
    for (int k = i; k < seg_start * bucketSize; k++)
        curr_min = Math.min(curr_min, a[k]);

    // tail: elements after the last full bucket
    for (int k = j; k >= (seg_end + 1) * bucketSize; k--)
        curr_min = Math.min(curr_min, a[k]);

    return curr_min;
}

private static int[] process(int[] a, int bucketSize) {
    int[] m = new int[(int) Math.ceil(a.length * 1.0 / bucketSize)];

    int count = 0;
    for (int i = 0; count < a.length; i++) {
        int min = a[count];
        m[i] = count;
        for (int j = 0; j < bucketSize && count < a.length; j++) {
            if (a[count] < min) {
                min = a[count];
                m[i] = count;
            }
            count++;
        }
    }
    return m;
}
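As with the sparse-table post, the fragments become one runnable class (the array is the same illustrative sample data, not anything prescribed by the technique):

```java
public class RmqSqrtDemo {
    static int[] a = {1, 9, 2, 3, 5, 8, 90, 1, 2, 88};
    static int bucketSize = (int) Math.ceil(Math.sqrt(a.length)); // 4

    public static void main(String[] args) {
        int[] m = process(a, bucketSize);
        System.out.println(query(2, 9, m, a, bucketSize)); // min of a[2..9] -> 1
        System.out.println(query(5, 7, m, a, bucketSize)); // min of a[5..7] -> 1
    }

    static int query(int i, int j, int[] m, int[] a, int bucketSize) {
        int curr_min = a[i];
        if (j - i + 1 < 2 * bucketSize) { // short range: plain scan
            for (int k = i; k <= j; k++) curr_min = Math.min(curr_min, a[k]);
            return curr_min;
        }
        int seg_start = (i + bucketSize - 1) / bucketSize; // first full bucket
        int seg_end = (j + 1) / bucketSize - 1;            // last full bucket
        for (int k = seg_start; k <= seg_end; k++) curr_min = Math.min(curr_min, a[m[k]]);
        for (int k = i; k < seg_start * bucketSize; k++) curr_min = Math.min(curr_min, a[k]);      // head
        for (int k = j; k >= (seg_end + 1) * bucketSize; k--) curr_min = Math.min(curr_min, a[k]); // tail
        return curr_min;
    }

    static int[] process(int[] a, int bucketSize) {
        int[] m = new int[(int) Math.ceil(a.length * 1.0 / bucketSize)]; // min index per bucket
        int count = 0;
        for (int i = 0; count < a.length; i++) {
            int min = a[count];
            m[i] = count;
            for (int j = 0; j < bucketSize && count < a.length; j++) {
                if (a[count] < min) { min = a[count]; m[i] = count; }
                count++;
            }
        }
        return m;
    }
}
```

O(n) preprocessing and O(sqrt(n)) per query: at most sqrt(n) full buckets plus two partial buckets of at most sqrt(n) elements each.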

Thursday, September 15, 2016

aws cli s3

aws configure

copy bucket to local
aws s3 sync s3://bucket-name .

copy local folder to s3 bucket
aws s3 sync folder/ s3://folder

Thursday, August 11, 2016


Producers:
  1. Fire & forget, Sync, Async with callback(Future, future.get())
  2. 2 kind of errors : retriable(connection error, no leader) & non-retriable(message too large)
  3. RecordMetadata rec = producer.send(new ProducerRecord()); rec has offset of message
  4. ProducerRecord needs topic name, key serializer, value serializer, bootstrap.servers(at least 2 so that if one is down other can take over, rest will be discovered by them)
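The minimal configuration from point 4 looks like this as a properties sketch (the broker host names are placeholders, and String serializers are just one example choice):

```properties
# at least 2 bootstrap servers so that if one is down the other can take over;
# the rest of the cluster is discovered from them
bootstrap.servers=broker1:9092,broker2:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
```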


Consumers:
  1. One consumer group per application. If #consumers > #partitions, the rest of the consumers will remain idle.
  2. One consumer per thread is the rule.
  3. How does consumer commit offset:
    1. auto commit after every 5 seconds during poll.
    2. or commit manually - consumer.commitSync()
    3. or commitAsync() with callback
    4. commitAsync() for regular commits, commitSync() in finally block
    5. commitSync and commitAsync can be called with explicit topic,partition,offset too.
  4. RebalanceListener
    1. onPartitionsRevoked
    2. onPartitionsAssigned
  5. seekToBeginning, seekToEnd, seek to a specific offset
  6. Consumer clean exit: call consumer.wakeup() from a shutdown hook, catch WakeupException in the poll loop, then consumer.close().
  7. Single Consumer also possible as opposed to the Consumer Group.
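The commit options in point 3 map to consumer configuration roughly like this (group name and brokers are placeholders):

```properties
bootstrap.servers=broker1:9092,broker2:9092
# one consumer group per application
group.id=my-app
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=org.apache.kafka.common.serialization.StringDeserializer
# defaults: offsets auto-committed during poll() every 5 seconds;
# set enable.auto.commit=false to use commitSync()/commitAsync() manually
enable.auto.commit=true
auto.commit.interval.ms=5000
```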


Serializers:
  1. Custom serializers
  2. Avro: Schema can be changed without affecting existing messages

Partitioners:
  1. Default: hash
  2. Custom

Friday, July 22, 2016


Basic unit: Message(Like a row in a table). Message can have a key(metadata) associated.
Message schema: Apache Avro is most popular. JSON/XML also fine.
Topic(like a DB table)/Partition: One topic can have multiple partitions.
Consumer(Group): A consumer can select specific partition(s) to listen on.

1. There is no guarantee of time-ordering of messages across the entire topic, just within a single partition. 

2. Each partition can be hosted on a different server, which means that a single topic can be scaled horizontally across multiple servers to provide for performance.

3. The term stream is often used when discussing data within systems like Kafka. Most often, a stream is considered to be a single topic of data, regardless of the number of partitions. This represents a single stream of data moving from the producers to the consumers. This way of referring to messages is most common when discussing stream processing, which is when frameworks, some of which are Kafka Streams, Apache Samza, and Storm, operate on the messages in real time.

Broker(a single server) - receives messages from producers and services consumers.
Broker cluster - one of them is controller.

Retention of data by time/size.

Topics may also be configured as log compacted, which means that Kafka will retain only the last message produced with a specific key. This can be useful for changelog-type data, where only the last update is interesting.

MirrorMaker - for data replication across clusters.

Thursday, June 23, 2016


docker logs

docker logs -f

docker run -d -p

docker run -it

docker attach

docker port webserver 80

docker diff webserver

docker cp

docker inspect webserver

docker rm -f $(docker ps -aq)

docker search postgres
docker pull postgres:latest

Thursday, June 16, 2016


always safe : git checkout -b new_branch

git reset --cached
git rm --cached
git reset -- tracking.txt
git rm
git mv

git add -A
git add -a
git add -u

git add --update
git add --all

git init
git clone
git --version


git clone --bare
git log --abbrev-commit --abbrev=4 --pretty=oneline -10
git log --author="John Resig"
git log --pretty=oneline --since="2012-12-20" --until="2013-01-01" -5
git log --pretty=format:"%an --- %H"
Joe Doe --- 123456...

git log --pretty=oneline | wc -l
 git shortlog -s | wc -l

git log --pretty=format:%cd --date=short | uniq | wc -l

 du -h -s .git
du -h -s --exclude=.git

git alias

git stat
git info

git shortlog -n -s

    bare = true

git status -s -b

username different for every repo
global vs local config

git init

git reflog
git reset --hard HEAD@{1}
git checkout -b new_branch

git reflog expire --all --expire=now
git prune --dry-run

2 status

git rm
git mv
git rm --cached
git log --oneline --graph --decorate --all
git show info^{tree}
git show SHA-1^{tree}
git branch -v -> latest revision in every branch
git pack-refs --all
git internal files
cat .git/refs/heads/new_one
cat .git/HEAD
cat .git/packed-refs

git branch -vv
git remote -v
ordinary local branches
local tracking
remote tracking
different pull push urls
git branch -a
git branch -r

create branch : git branch doc
switch branch : git checkout doc
show branches : git branch
git checkout -b new-branch existing-branch
git checkout -b new-branch HEAD
Show all branches : 
git branch -a -vv
git branch -r : remote
Ordinary local branches 
Local tracking branches 
Remote tracking branches 
git remote rm origin
list-remote-branches = "!listRemoteBranches() {
    git branch -r | sed \"/->/d; s/  origin\\///g\";
}; listRemoteBranches"
checkout-remote-branches = "!checkoutRemoteBranches() {
    for name in `git list-remote-branches`; do
        git checkout $name;
    done;
}; checkoutRemoteBranches"

clone-with-branches = "!cloneWithBranches() {
    git clone $1 $2;
    cd $2;
    git checkout-remote-branches;
    git remote rm origin
}; cloneWithBranches"
git remote -v
For untracked files:
git clean -n : dry run
git clean -f : delete them
git pack-refs
git cherry-pick <rev> -> creates a new rev
git branch -d
git branch -D
git branch --merged
git branch --no-merged
branch rename
git branch -m info information
git branch -M info information
checkout file from a different revision/branch
git checkout info -- i1.txt
git show doc:m1.txt
git clone won't preserve reflog, cp will
 git merge --ff-only feature : merge only if fast forward case
git log --oneline --merges : show only merge commits
git log --oneline --no-merges : show only no merge commits
git log --oneline --max-parents=X --min-parents=Y
 git merge --no-ff feature : force creation of merge commit even if fast forward is possible
git log --oneline --graph --all
git merge a b c d
git rebase
git rev-parse master
git checkout `git rev-parse master` : to enter detached HEAD
rebase alternative : git checkout `git rev-parse master`; git am *.patch; git checkout -B feature 
rebase can also be done with cherry-pick

Thursday, June 9, 2016


git log --graph --all --oneline --decorate


git merge --squash feature : commit all the pending commits of branch
<feature> in current branch

$ git checkout --ours [filename]
$ git checkout --theirs [filename]

git checkout --merge [filename]

git checkout --conflict=diff3 numbers.txt

git commit --no-edit

git merge --abort

echo "numbers.txt binary" > .gitattributes

Wednesday, June 8, 2016


git log --format="%h" --grep=XXX --all

git log --oneline master feature brave-idea ^`git merge-base master feature` ^`git merge-base feature brave-idea`

Partial rebase
git rebase --onto a b c

git commit --amend

git rebase -i HEAD~3 for squash, reordering commits, removing commits


git merge-base feature brave-idea : common ancestor of 2 branches

git merge-base --octopus feature brave-idea master : common ancestor
of more than 2 branches

diff of branches : git log --oneline master..brave-idea

revisions included in a, b, or c and excluded from d, e, and f

git log a b c --not d --not e --not f

git log a b c ^d ^e ^f

Tuesday, May 24, 2016


git diff
git diff --cached
git l --name-only
git status -s
git reflog expire --all --expire=now
git prune --dry-run
git fsck --unreachable

non bare
clean dirty


Monday, May 2, 2016

Go race conditions

A data race occurs whenever two goroutines access the same variable concurrently and at least one of the accesses is a write. It follows from this definition that there are three ways to avoid a data race.

The first way is not to write the variable: load it beforehand and only give read access.

The second way to avoid a data race is to avoid accessing the variable from multiple goroutines: a monitor goroutine that selects on multiple channels.

Even when a variable cannot be confined to a single goroutine for its entire lifetime, confinement may still be a solution to the problem of concurrent access. For example, it's common to share a variable between goroutines in a pipeline by passing its address from one stage to the next over a channel. If each stage of the pipeline refrains from accessing the variable after sending it to the next stage, then all accesses to the variable are sequential.
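The monitor idea transliterates to other languages too. A sketch in Java (everything here is invented for illustration): a BlockingQueue plays the role of the channel, and only the monitor thread ever touches the variable, so no locks are needed around it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MonitorThreadDemo {
    private static final int POISON = Integer.MIN_VALUE; // sentinel: "no more deposits"

    // Sends `deposits` increments of 1 from 4 threads to a monitor thread that
    // is the only code reading or writing `balance` -- no locks, no data race.
    static int run(int deposits) throws Exception {
        BlockingQueue<Integer> ch = new ArrayBlockingQueue<>(deposits + 1);
        CompletableFuture<Integer> result = new CompletableFuture<>();

        Thread monitor = new Thread(() -> {
            int balance = 0; // confined to this thread
            while (true) {
                int amount;
                try { amount = ch.take(); } catch (InterruptedException e) { break; }
                if (amount == POISON) break;
                balance += amount;
            }
            result.complete(balance);
        });
        monitor.start();

        ExecutorService producers = Executors.newFixedThreadPool(4);
        for (int i = 0; i < deposits; i++) producers.submit(() -> ch.add(1));
        producers.shutdown();
        producers.awaitTermination(10, TimeUnit.SECONDS);
        ch.add(POISON); // all deposits are queued; tell the monitor to finish

        return result.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(100)); // prints 100
    }
}
```

The queue serializes all access: producers never read or write `balance`, they only send messages, which is exactly the "do not communicate by sharing memory; share memory by communicating" discipline.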

Monday, April 25, 2016

installing ssl certificate/enabling https on apache,amazon linux with letsencrypt

yum install mod24_ssl openssl
cd letsencrypt/
./letsencrypt-auto --debug certonly --webroot -w /var/www/html/abc -d

Edit httpd.conf

<IfModule mod_ssl.c>
    Listen 443
</IfModule>

<VirtualHost *:443>
        SSLEngine on
        SSLCertificateFile /etc/letsencrypt/live/
        SSLCertificateKeyFile /etc/letsencrypt/live/
        SSLCertificateChainFile /etc/letsencrypt/live/
        DocumentRoot "/var/www/html/abc"
        <Directory "/var/www/html/abc">
                Options Indexes FollowSymLinks
                AllowOverride All
                Order allow,deny
                Allow from all
        </Directory>
</VirtualHost>

service httpd restart

Monday, March 28, 2016

Installing Lua on CentOS


mkdir lua
yum install gcc gcc-c++ kernel-devel
yum install readline-devel
tar zxvf lua-5.1.4.tar.gz
cd lua-5.1.4
make linux

Tuesday, March 8, 2016

interview questions

RSS Feed reader designer
machine coding pubsub
leaderboard design game click

Saturday, March 5, 2016

interview questions

partition an array: all elements <k, then =k, then >k (3-way partition)
cron scheduler design

job submitter sync/async

mobile next word suggestion
xml parser system design
deleteMin,deleteMax,findMin,findMax,insert,delete - optimize for them - BST + heap

LRU -> doubly linked list + hashmap
count of numbers with consecutive zeroes.
binary search(number repeated, find the least index)
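The LRU item above (doubly linked list + hashmap) maps directly onto Java's LinkedHashMap, which is exactly that combination. A minimal sketch (class name and capacity are made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache: a LinkedHashMap in access order evicts the least recently
// used entry once capacity is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder=true: get() moves an entry to the tail
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the head (least recently used)
    }
}
```

Touching a key with get() refreshes it, so with capacity 2, after put(a), put(b), get(a), put(c), the evicted entry is b, not a.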

Tuesday, March 1, 2016

nginx notes

  1. When the first version of nginx was released, it was meant to be deployed alongside Apache such that static content like HTML, CSS, JavaScript and images were handled by nginx to offload concurrency and latency processing from Apache-based application servers.
  2. While nginx works in a Windows environment, the Windows version of nginx is more like a proof-of-concept rather than a fully functional port. 
  3. nginx doesn't support dynamically loaded modules; i.e., modules are compiled along with the core at build stage.
  4. nginx conserves CPU cycles as well because there's no ongoing create-destroy pattern for processes or threads.
  5. nginx does not currently support Apache-style distributed configurations (i.e., .htaccess files). All of the configuration relevant to nginx web server behavior should reside in a centralized set of configuration files.

Monday, February 15, 2016

Ninja notes

What Ninja Does
The build file, once parsed, describes a graph of dependencies: the final output binary depends on linking a number of objects, each of which is the result of compiling sources. Specifically it is a bipartite graph, where "nodes" (input files) point to "edges" (build commands) which point to nodes (output files). The build process then traverses this graph.

Chrome (roughly 40,000 files) is roughly the size of the Linux kernel, so the approach taken there ought to work.

Supporting Windows
  1. straightforward changes like making the path separator a backslash or changing the Ninja syntax to allow colons in a path (like c:\foo.txt)
  2. Windows has a relatively low limit on the length of a command, an issue that comes up when constructing the command passed to a final link step that may mention most files in the project.
  3. Getting the modification time of a file (GetFileAttributesEx() on Windows and stat() on non-Windows platforms) seems to be about 100 times slower on Windows than it is on Linux.
  4. The Git version control system, which similarly needs to get the state of many files, can use multiple threads on Windows to execute file checks in parallel. Ninja ought to adopt this feature.

Executing a Build
Ninja runs build commands in parallel by default, based on the number of available CPUs on the system.

The Build Log
  1. Linux kernel build system tracks the commands used to generate outputs.
  2. Consider a motivating example: you compile an input foo.c into an output foo.o, and then change the build file such that it should be rebuilt with different compilation flags.
  3. For the build system to know that it needs to rebuild the output, it must either note that foo.o depends on the build files themselves (which, depending on the organization of the project, might mean that a change to the build files would cause the entire project to rebuild), or record the commands used to generate each output and compare them for each build.
  4. Ninja writes out a build log that records the full commands used to generate each output.
  5. Then for each subsequent build, Ninja loads the previous build log and compares the new build's commands
  6. Rather than recording commands, which are frequently very long and take a lot of time to parse, Ninja instead records a hash of the command.
  7. Using hashes reduced the size of the build log dramatically–from 200 MB to less than 2 MB on Mac OS X–and made it over 20 times faster to load.

Dependency Files
A better approach relies on the fact that at compile time gcc (and Microsoft's Visual Studio) can output which headers were used to build the output.

  1. Ninja avoids using strings to identify paths. Instead, Ninja maps each path it encounters to a unique Node object and the Node object is used in the rest of the code. Reusing this object ensures that a given path is only ever checked on disk once, and the result of that check (i.e., the modification time) can be reused in other code.
  2. To always use the same Node for a single file, Ninja must reliably map every possible name for a file into the same Node object. This requires a canonicalization pass on all paths mentioned in input files,
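The interning idea in points 1 and 2 can be sketched like this (class and field names are made up for illustration, not Ninja's actual code): every spelling of a path canonicalizes to one shared Node, so any per-file work, like the stat result, is cached in a single place.

```java
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class PathInterner {
    static final class Node {
        final String path;
        Long mtime; // filled in on the first disk check, reused afterwards
        Node(String path) { this.path = path; }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    // Canonicalize (collapse "./" and "x/../") and intern.
    public Node get(String path) {
        String canon = Paths.get(path).normalize().toString();
        return nodes.computeIfAbsent(canon, Node::new);
    }

    public static void main(String[] args) {
        PathInterner interner = new PathInterner();
        // two spellings of the same file end up as the same Node object
        System.out.println(interner.get("src/./foo.c") == interner.get("src/foo.c")); // true
    }
}
```

Because lookups return the same object, identity comparison replaces string comparison everywhere downstream, and the cached mtime field guarantees at most one disk check per file.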

Saturday, February 13, 2016

The Economist Excerpts

Cannabis legalization :
  1. Libertarians may ask why cannabis, which has no known lethal dose, should be regulated at all for adults who can make free, informed decisions. 
  2. There are two reasons for care. First, cannabis appears to induce dependency in a minority of users, meaning the decision whether to light up is not a free one. 
  3. Second, cannabis's illegality means that the research on its long-term effects is hazy, so even the most informed decision is based on incomplete information. 
  4. When decisions are neither always free nor fully informed, the state is justified in steering consumers away, as it does from alcohol and tobacco.
  5. One model is the United States after Prohibition: alcohol taxes were set low at first, to drive out the bootleggers; later, with the Mafia gone, they were ramped up.
  6. Likewise, alluring packaging and products, such as cannabis sweets that would appeal to children, should be outlawed, just as many countries outlaw flavoured cigarettes and alcohol-spiked sweets.
  7. Advertising should be banned.
  8. A similar trade-off applies when determining what products to allow. Cannabis no longer means just joints. Legal entrepreneurs have cooked up pot-laced food and drink, reaching customers who might have avoided smoking the stuff. Ultra-strong "concentrates" are on offer to be inhaled or swallowed. Edibles and stronger strains help put the illegal dealers out of business, but they also risk encouraging more people to take the drug, and in stronger forms. The starting-point should be to legalise only what is already available on the black market. That would mean capping or taxing potency, much as spirits are taxed more steeply and are less available than beer. Again, the mix will vary. Europe may be able to ban concentrates. America already has a taste for them. If the product were outlawed there the mob would gladly step in.

Sunday, February 7, 2016

downloading usb oem drivers for android devices

The Economist excerpts

  1. Managing migrant crisis in EU : There is an encouraging precedent, too. When more than 1m "boat people" fled Vietnam after the communists took over in 1975, they went initially to refugee camps in Hong Kong and other parts of Asia before being sent to America, Europe, Australia and wherever else would take them. They arrived with nothing but adapted astonishingly fast: the median household income for Vietnamese-Americans, for example, is now above the national average. No one in America now frets that the boat people will not fit in.
  2. Turkey - Mr Erdogan and his AK party are tightening his grip on the country by controlling Media, Judiciary, Schools etc. though Turkey's economy is improving under him. Democracy is like a train, he said once; you get off once you have reached your destination.
  3. Turkey's urban growth - Istanbul is ready for its third airport. Urban growth in Turkey is as fast as India/China yet as good as Europe with even slums having tidy pavements and sanitation.
  4. Turkey has been reasonably free of radical Islam. Few years ago, it set out on a mission to resolve problems with its neighbors. Has also done great by taking in 2 mil Syrian refugees.
  5. Should HSBC move from London to Hong Kong? - HSBC matters. Regulators judge it to be the world's most important bank, alongside JPMorgan Chase. A tenth of global trade passes through its systems and it has deep links with Asia. (Simon Robertson, a director of the bank, is also on the board of The Economist Group.) Its record has blemishes—most notably, weak money-laundering controls in Mexico. But it has never been bailed out; indeed, it supplied liquidity to the financial system in 2008-09. It is organised in self-reliant silos, a structure regulators now say is best practice.
  6. Singapore - reserving minimum number of seats for opposition.
  7. Cut flower exports by Kenya are increasing though way behind Netherlands.
  8. Preclinical Reproducibility and Robustness
  9. Tech and startups - Klarna, Voice-powered medical devices, Doping in racehorses, Organ preservation, Guinea-worm disease, 

Monday, February 1, 2016

ZeroMQ - batching

ØMQ tries to deliver consistently low latencies combined with high throughput using the following strategy: when message flow is sparse and doesn't exceed the network stack's bandwidth, ØMQ turns all the batching off to improve latency. The trade-off here is somewhat higher CPU usage—we still have to traverse the stack frequently. However, that isn't considered to be a problem in most cases.

When the message rate exceeds the bandwidth of the network stack, it makes sense to start batching aggressively. There's nothing to lose as the latencies are already high anyway. On the other hand, aggressive batching improves throughput and can empty the queue of pending messages—which in turn means the latency will gradually drop as the queueing delay decreases. Once there are no outstanding messages in the queue, the batching can be turned off to improve the latency even further.

One additional observation is that the batching should only be done on the topmost level. If the messages are batched there, the lower layers have nothing to batch anyway, and so all the batching algorithms underneath do nothing except introduce additional latency.

ZeroMQ - latency vs throughput

To make a long story short, latency and throughput are two different metrics; that much is obvious. The important thing is to understand the difference between the two and their mutual relationship. Latency can be measured only between two different points in the system; There's no such thing as latency at point A. Each message has its own latency. You can average the latencies of multiple messages; however, there's no such thing as latency of a stream of messages.

Throughput, on the other hand, can be measured only at a single point of the system. There's a throughput at the sender, there's a throughput at the receiver, there's a throughput at any intermediate point between the two, but there's no such thing as the overall throughput of the whole system. And throughput makes sense only for a set of messages; there's no such thing as the throughput of a single message.
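The two definitions are easy to make concrete with a toy calculation (the timestamps below are made up):

```python
# Timestamps (seconds) for five messages: sent at point A, received at point B.
sent_at     = [0.00, 0.10, 0.20, 0.30, 0.40]
received_at = [0.05, 0.17, 0.29, 0.41, 0.53]

# Latency is a per-message, two-point metric: B minus A, one value per message.
latencies = [r - s for s, r in zip(sent_at, received_at)]
avg_latency = sum(latencies) / len(latencies)

# Throughput is a single-point, whole-stream metric: messages per second
# observed at the receiver over the measurement window.
throughput = len(received_at) / (received_at[-1] - received_at[0])
```

Note that `latencies` is a list (one number per message) while `throughput` is a single number for the whole stream, mirroring the point above: you can average latencies, but a single message has no throughput.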

ZeroMQ - lock free algorithms

Lock-free algorithms:
  • Lock-free algorithms have been in vogue lately. They are simple mechanisms for inter-thread communication that don't rely on the kernel-provided synchronisation primitives, such as mutexes or semaphores; rather, they do the synchronisation using atomic CPU operations, such as atomic compare-and-swap (CAS). It should be understood that they are not literally lock-free—instead, locking is done behind the scenes on the hardware level.
  • Each queue has exactly one writer thread and exactly one reader thread. If there's a need for 1-to-N communication, multiple queues are created (Figure 24.8). Given that this way the queue doesn't have to take care of synchronising the writers (there's only one writer) or readers (there's only one reader) it can be implemented in an extra-efficient way.
  • While lock-free algorithms are generally more efficient than classic mutex-based algorithms, atomic CPU operations are still rather expensive (especially when there's contention between CPU cores), and doing an atomic operation for each message written and/or each message read was slower than we were willing to accept.
  • Receiving a packet is an atomic event; you cannot get half of it. This atomic event results in the need to write 10 messages to the lock-free queue. There's not much point in doing an atomic operation for each message. Instead, you can accumulate the messages in a "pre-write" portion of the queue that's accessed solely by the writer thread, and then flush it using a single atomic operation.
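The single-writer/single-reader queue with a "pre-write" portion can be sketched like this. Python has a GIL, so this is a structural illustration of the pattern rather than real lock-free code; the swap in `flush()` stands in for the single atomic operation that publishes the whole batch.

```python
class SPSCQueue:
    """Sketch of the single-producer/single-consumer queue described above:
    the writer accumulates messages in a private 'pre-write' list and
    publishes them with one operation, instead of paying a
    synchronisation cost per message."""

    def __init__(self):
        self._shared = []     # portion visible to the reader thread
        self._prewrite = []   # touched only by the writer thread

    def write(self, msg):
        # No synchronisation needed: only the writer ever touches this list.
        self._prewrite.append(msg)

    def flush(self):
        # One "atomic" publish for the whole accumulated batch
        # (in ZeroMQ this is a single compare-and-swap).
        batch, self._prewrite = self._prewrite, []
        self._shared.extend(batch)

    def read_all(self):
        # Reader drains everything that has been published so far.
        msgs, self._shared = self._shared, []
        return msgs
```

So if one network packet yields 10 messages, the writer does 10 cheap `write()` calls and a single `flush()`, rather than 10 synchronised operations.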

Friday, January 29, 2016

Dhanji R. Prasanna was a developer on Google Wave

[A]s a programmer you must have a series of wins, every single day. […] It is what makes you eager for the next feature, and the next after that. And a large team is poison to small wins. The nature of large teams is such that even when you do have wins, they come after long, tiresome and disproportionately many hurdles. And this takes all the wind out of them. Often when I shipped a feature it felt more like relief than euphoria.

Product management - PRD

Shorthand flow diagram for how the user interacts with the product : 

The Economist excerpts from 30th Jan 2016 edition (The Brawl Begins)

  • Artificial currency (Naira) control in Nigeria hurting local economy
  • Steel making industries in rich countries are suffering
  • Disruptions in art auction market - Artsy, Christie's, Sotheby's
  • Fintech disrupting insurance market
  • Walmart raising wages significantly
  • Hailstorm devices in the USA - privacy issues - police locating people by simulating mobile towers
  • Advances in figuring out causes of Schizophrenia
  • Controlling miniature satellites -  Cubism
  • Zika virus and pregnancy advisory
  • Suicide jungle of Japan
  • Africa's gym craze
  • Tibet - Bottling Himalayan water could be bad for the region's environment
  • Refugees avoiding France and favoring Germany

Tuesday, January 26, 2016

NoSQL notes


  • SQL intro : 
    • Declarative - SQL allows you to ask complex questions without thinking about how the data is laid out on disk, which indices to use to access the data, or what algorithms to use to process the data. A significant architectural component of most relational databases is a query optimizer.
  • Problems with SQL - 
    • Complexity leads to unpredictability. SQL's expressiveness makes it challenging to reason about the cost of each query, and thus the cost of a workload.
    • The relational data model is strict
    • If the data grows past the capacity of one server - partition/denormalize
  • Key-Data Structure Stores - Redis
  • Key-Document Stores - CouchDB/MongoDB/Riak
  • BigTable Column Family Stores - HBase/Cassandra. In this model, a key identifies a row, which contains data stored in one or more Column Families (CFs). Conceptually, one can think of Column Families as storing complex keys of the form (row ID, CF, column, timestamp), mapping to values which are sorted by their keys. This design results in data modeling decisions which push a lot of functionality into the keyspace.
  • HyperGraphDB and Neo4j are two popular NoSQL storage systems for storing graph-structured data.
  • Redis is the notable exception to the no-transaction trend. On a single server, it provides a MULTI command to combine multiple operations atomically and consistently, and a WATCH command to allow isolation.
  • Single server durability - primarily by controlling fsync frequency
  • Multiple server durability - With subtle differences, Riak, Cassandra, and Voldemort allow the user to specify N, the number of machines which should ultimately have a copy of the data, and W<N, the number of machines that should confirm the data has been written before returning control to the user.
  • Sharding/Partitioning means that no one machine has to handle the write workload on the entire dataset, but no one machine can answer queries about the entire dataset.
  • Sharding adds system complexity, and where possible, you should avoid it. Try these first: read replicas and caching - Facebook has Memcached installations in the range of tens of terabytes of memory!
  • Consistent Hashing Ring
  • Conflict resolution by vector clocks
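The consistent hashing ring mentioned above is small enough to sketch with the standard library. This is a minimal illustration (node names and virtual-node count are made up), showing why Riak/Cassandra-style stores use it: adding or removing one node only remaps the keys that hashed to that node's points on the ring.

```python
import bisect
import hashlib

def _hash(key):
    # Any stable, well-distributed hash works; md5 is used for illustration.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal consistent-hashing ring. Each node gets several virtual
    points on the ring so keys spread evenly and removing one node
    only remaps that node's share of the keyspace."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_hash(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))
        self._points = [p for p, _ in self._ring]

    def node_for(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        i = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[i][1]
```

With three nodes, dropping one should move roughly a third of the keys and leave the rest where they were, unlike naive `hash(key) % n` sharding where nearly everything moves.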

Sharing an AMI with Specific AWS Accounts

Thursday, January 21, 2016

Excerpts from Making It Right: Product Management for a Startup World


If you want to build a ship, don't drum people up together to collect wood, and don't assign them tasks and work. Rather teach them to long for the endless immensity of the sea. - Antoine de Saint Exupéry

Engineers don't hate process. They hate process that can't defend itself. - Michael Lopp (source)

MediaWiki's usage of DB slaves


The system administrator can specify, in MediaWiki's configuration, that there is one master database server and any number of slave database servers; a weight can be assigned to each server. The load balancer will send all writes to the master, and will balance reads according to the weights. It also keeps track of the replication lag of each slave. If a slave's replication lag exceeds 30 seconds, it will not receive any read queries to allow it to catch up; if all slaves are lagged more than 30 seconds, MediaWiki will automatically put itself in read-only mode.

MediaWiki's "chronology protector" ensures that replication lag never causes a user to see a page that claims an action they've just performed hasn't happened yet: for instance, if a user renames a page, another user may still see the old name, but the one who renamed will always see the new name, because he's the one who renamed it. This is done by storing the master's position in the user's session if a request they made resulted in a write query. The next time the user makes a read request, the load balancer reads this position from the session, and tries to select a slave that has caught up to that replication position to serve the request. If none is available, it will wait until one is. It may appear to other users as though the action hasn't happened yet, but the chronology remains consistent for each user.
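The chronology protector idea fits in a few lines of pseudocode-ish Python. This is an illustration of the mechanism described above, not MediaWiki's actual code; the `position`, `execute`, and `query` attributes are stand-ins for replication coordinates and DB calls.

```python
class LoadBalancer:
    """Sketch of MediaWiki's 'chronology protector': after a write,
    remember the master's replication position in the user's session;
    on the next read, only use a slave that has caught up to it."""

    def __init__(self, master, slaves):
        self.master = master   # assumed to expose .position and .execute()
        self.slaves = slaves   # each assumed to expose .position and .query()

    def write(self, session, sql):
        self.master.execute(sql)
        # Pin this user's session to the master's position after the write.
        session["master_pos"] = self.master.position

    def read(self, session, sql):
        need = session.get("master_pos", 0)
        for slave in self.slaves:
            if slave.position >= need:   # this slave has replayed our write
                return slave.query(sql)
        # No slave has caught up; fall back to the master
        # (the real system waits for a slave instead).
        return self.master.query(sql)
```

Other users (with no pinned position in their session) can be served by any slave, which is why they may briefly see the pre-rename state while the renaming user always sees their own write.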

Wednesday, January 20, 2016

SqlAlchemy notes


  1. Hybrid property/method : when invoked through an instance returns some value but for class based invocation returns sql.
  2. Alembic - for migrations
  3. Reflection - for loading table structure from DB
  4. SqlAlchemy Core - has sum/count/order by/group by etc. SQL equivalents.
  5. Session states for an object - Transient/Pending/Persistent/Detached
  6. session.expunge() for removing an object from session
  7. SqlAlchemy ORM is different from core
  8. (Core) ResultProxy - you can iterate through the resultset using indices or key/value both
  9. Association proxy: e.g. for a many-to-many relationship between Cookie and Ingredient where you just want to create ingredients for a cookie, or just fetch ingredients' names. With an association proxy you can do it directly, without looping through the intermediate objects.
  10. sqlacodegen - generates SQLAlchemy model code from an existing database, using reflection
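Point 1 above (hybrid property) is easier to see with a toy descriptor. This is a plain-Python analog of the mechanism, not SQLAlchemy code (the real thing lives in `sqlalchemy.ext.hybrid` and returns an actual SQL expression for class access; here a string stands in for it):

```python
class hybrid(object):
    """Toy analog of SQLAlchemy's hybrid_property: a descriptor that
    behaves differently for instance access vs class access."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, cls):
        if obj is None:
            # Class access (e.g. inside a query): return something
            # query-like. A string stands in for a SQL expression here.
            return f"SQL expression for {self.func.__name__}"
        # Instance access: compute the plain Python value.
        return self.func(obj)

class Interval:
    def __init__(self, start, end):
        self.start, self.end = start, end

    @hybrid
    def length(self):
        return self.end - self.start
```

So `Interval(3, 10).length` evaluates the Python function, while `Interval.length` (class-level, as used in a filter clause) yields the query-side form, which is exactly the dual behaviour the note describes.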

Wednesday, January 6, 2016

opencv notes - circle/line detection

edge detection: Imgproc.Canny
line detection: Imgproc.HoughLinesP and Imgproc.HoughLines
circle detection: Imgproc.HoughCircles

Tuesday, January 5, 2016

OpenCV notes - Edge detection

Edge detection using Gaussian and Laplacian pyramids

OpenCV implements the relevant downsizing and upsizing functions as cv2.pyrDown and cv2.pyrUp.

OpenCV notes - Eulerian video magnification

A demo :

Python implementation :

Fourier transform for dummies

OpenCV notes - Optical Flow

Optical flow - e.g. tracking how far an object has moved from its current position, such as following a nod or shake of a face.

OpenCV's calcOpticalFlowPyrLK function implements the Lucas-Kanade method of computing optical flow.

The OpenCV function goodFeaturesToTrack selects features using the Good Features to Track algorithm (also known as the Shi-Tomasi algorithm), which, as the name suggests, attempts to select features that work well with tracking algorithms and tracking use cases.
