Software Troubles and Troubleshooting: February 2016

Monday, February 15, 2016

Ninja notes

http://aosabook.org/en/posa/ninja.html

What NInja Does
The build file, once parsed, describes a graph of dependencies: the final output binary depends on linking a number of objects, each of which is the result of compiling sources. Specifically it is a bipartite graph, where "nodes" (input files) point to "edges" (build commands) which point to nodes (output files)6. The build process then traverses this graph.

Chrome(roughly 40,000 files) is roughly the size of the Linux kernel so the approach taken there ought to work.

Supporting Windows

straightforward changes like making the path separator a backslash or changing the Ninja syntax to allow colons in a path (like c:\foo.txt)
Windows has a relatively low limit on the length of a command, an issue that comes up when constructing the command passed to a final link step that may mention most files in the project.
Getting the modification time of a file–GetFileAttributesEx() on Windows13 and stat() on non-Windows platforms–seems to be about 100 times slower on Windows than it is on Linux.14
The Git version control system, which similarly needs to get the state of many files, can use multiple threads on Windows to execute file checks in parallel. Ninja ought to adopt this feature.

Executing a Build
Ninja runs build commands in parallel by default, based on the number of available CPUs on the system.

The Build Log

Linux kernel build system tracks the commands used to generate outputs.
Consider a motivating example: you compile an input foo.c into an output foo.o, and then change the build file such that it should be rebuilt with different compilation flags.
For the build system to know that it needs to rebuild the output, it must either note that foo.o depends on the build files themselves (which, depending on the organization of the project, might mean that a change to the build files would cause the entire project to rebuild), or record the commands used to generate each output and compare them for each build.
Ninja writes out a build log that records the full commands used to generate each output.
Then for each subsequent build, Ninja loads the previous build log and compares the new build's commands
Rather than recording commands, which are frequently very long and take a lot of time to parse, Ninja instead records a hash of the command.
Using hashes reduced the size of the build log dramatically–from 200 MB to less than 2 MB on Mac OS X–and made it over 20 times faster to load.

Dependency Files
A better approach relies on the fact that at compile time gcc (and Microsoft's Visual Studio) can output which headers were used to build the output.

Canonicalization

Ninja avoids using strings to identify paths. Instead, Ninja maps each path it encounters to a unique Node object and the Node object is used in the rest of the code. Reusing this object ensures that a given path is only ever checked on disk once, and the result of that check (i.e., the modification time) can be reused in other code.
To always use the same Node for a single file, Ninja must reliably map every possible name for a file into the same Node object. This requires a canonicalization pass on all paths mentioned in input files,

Saturday, February 13, 2016

The Economist Excerpts

Cannabis legalization :

Libertarians may ask why cannabis, which has no known lethal dose, should be regulated at all for adults who can make free, informed decisions.
There are two reasons for care. First, cannabis appears to induce dependency in a minority of users, meaning the decision whether to light up is not a free one.
Second, cannabis's illegality means that the research on its long-term effects is hazy, so even the most informed decision is based on incomplete information.
When decisions are neither always free nor fully informed, the state is justified in steering consumers away, as it does from alcohol and tobacco.
One model is the United States after Prohibition: alcohol taxes were set low at first, to drive out the bootleggers; later, with the Mafia gone, they were ramped up.
Likewise, alluring packaging and products, such as cannabis sweets that would appeal to children, should be outlawed, just as many countries outlaw flavoured cigarettes and alcohol-spiked sweets.
Advertising should be banned.
A similar trade-off applies when determining what products to allow. Cannabis no longer means just joints. Legal entrepreneurs have cooked up pot-laced food and drink, reaching customers who might have avoided smoking the stuff. Ultra-strong "concentrates" are on offer to be inhaled or swallowed. Edibles and stronger strains help put the illegal dealers out of business, but they also risk encouraging more people to take the drug, and in stronger forms. The starting-point should be to legalise only what is already available on the black market. That would mean capping or taxing potency, much as spirits are taxed more steeply and are less available than beer. Again, the mix will vary. Europe may be able to ban concentrates. America already has a taste for them. If the product were outlawed there the mob would gladly step in.

Sunday, February 7, 2016

downloading usb oem drivers for android devices

http://developer.android.com/tools/extras/oem-usb.html

Setting up the device : http://developer.android.com/tools/device.html

The Economist excerpts

Managing migrant crisis in EU : There is an encouraging precedent, too. When more than 1m "boat people" fled Vietnam after the communists took over in 1975, they went initially to refugee camps in Hong Kong and other parts of Asia before being sent to America, Europe, Australia and wherever else would take them. They arrived with nothing but adapted astonishingly fast: the median household income for Vietnamese-Americans, for example, is now above the national average. No one in America now frets that the boat people will not fit in.
Turkey - Mr Erdogan and his AK party are tightening his grip on the country by controlling Media, Judiciary, Schools etc. though Turkey's economy is improving under him. Democracy is like a train, he said once; you get off once you have reached your destination.
Turkey's urban growth - Istanbul is ready for its third airport. Urban growth in Turkey is as fast as India/China yet as good as Europe with even slums having tidy pavements and sanitation.
Turkey has been reasonably free of radical Islam. Few years ago, it set out on a mission to resolve problems with its neighbors. Has also done great by taking in 2 mil Syrian refugees.
Should HSBC move from London to Hong Kong? - HSBC matters. Regulators judge it to be the world's most important bank, alongside JPMorgan Chase. A tenth of global trade passes through its systems and it has deep links with Asia. (Simon Robertson, a director of the bank, is also on the board of The Economist Group.) Its record has blemishes—most notably, weak money-laundering controls in Mexico. But it has never been bailed out; indeed, it supplied liquidity to the financial system in 2008-09. It is organised in self-reliant silos, a structure regulators now say is best practice.
Singapore - reserving minimum number of seats for opposition.
Cut flower exports by Kenya are increasing though way behind Netherlands.
Preclinical Reproducibility and Robustness
Tech and startups - Klarna, Voice-powered medical devices, Doping in racehorses, Organ preservation, Guinea-worm disease,

Monday, February 1, 2016

ZeroMQ - batching

Source

ØMQ tries to deliver consistently low latencies combined with high throughput using the following strategy: when message flow is sparse and doesn't exceed the network stack's bandwidth, ØMQ turns all the batching off to improve latency. The trade-off here is somewhat higher CPU usage—we still have to traverse the stack frequently. However, that isn't considered to be a problem in most cases.

When the message rate exceeds the bandwidth of the network stack, in such situations it makes sense to start batching aggressively. There's nothing to lose as the latencies are already high anyway. On the other hand, aggressive batching improves throughput and can empty the queue of pending messages—which in turn means the latency will gradually drop as the queueing delay decreases. Once there are no outstanding messages in the queue, the batching can be turned off to improve the latency even further.

One additional observation is that the batching should only be done on the topmost level. If the messages are batched there, the lower layers have nothing to batch anyway, and so all the batching algorithms underneath do nothing except introduce additional latency.

ZeroMQ - latency vs throughput

Source

To make a long story short, latency and throughput are two different metrics; that much is obvious. The important thing is to understand the difference between the two and their mutual relationship. Latency can be measured only between two different points in the system; There's no such thing as latency at point A. Each message has its own latency. You can average the latencies of multiple messages; however, there's no such thing as latency of a stream of messages.

Throughput, on the other hand, can be measured only at a single point of the system. There's a throughput at the sender, there's a throughput at the receiver, there's a throughput at any intermediate point between the two, but there's no such thing as overall throughput of the whole system. And throughput make sense only for a set of messages; there's no such thing as throughput of a single message.

ZeroMQ - lock free algorithms

Source

Lock-free algorithms:

Lock-free algorithms have been in vogue lately. They are simple mechanisms for inter-thread communication that don't rely on the kernel-provided synchronisation primitives, such as mutexes or semaphores; rather, they do the synchronisation using atomic CPU operations, such as atomic compare-and-swap (CAS). It should be understood that they are not literally lock-free—instead, locking is done behind the scenes on the hardware level.
Each queue has exactly one writer thread and exactly one reader thread. If there's a need for 1-to-N communication, multiple queues are created (Figure 24.8). Given that this way the queue doesn't have to take care of synchronising the writers (there's only one writer) or readers (there's only one reader) it can be implemented in an extra-efficient way.
While lock-free algorithms were more efficient than classic mutex-based algorithms, atomic CPU operations are still rather expensive (especially when there's contention between CPU cores) and doing an atomic operation for each message written and/or each message read was slower than we were willing to accept.
Receiving a packet is an atomic event; you cannot get half of it. This atomic event results in the need to write 10 messages to the lock-free queue. There's not much point in doing an atomic operation for each message. Instead, you can accumulate the messages in a "pre-write" portion of the queue that's accessed solely by the writer thread, and then flush it using a single atomic operation.

Monday, February 15, 2016

Saturday, February 13, 2016

Sunday, February 7, 2016

Monday, February 1, 2016

Blog Archive