tag:blogger.com,1999:blog-54340319191104511402024-03-18T23:43:08.739+05:30Software Troubles and Troubleshootingoneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.comBlogger728125tag:blogger.com,1999:blog-5434031919110451140.post-14258968808579741542024-03-18T15:45:00.005+05:302024-03-18T23:42:19.078+05:30Mastering Langchain course by Sharath Raju<p><span style="font-size: x-large;">https://learning.oreilly.com/course/langchain-masterclass/9781835464427/<br /><br />https://github.com/PacktPublishing/LangChain-MasterClass---Build-15-OpenAI-and-LLAMA-2-LLM-Apps-using-Python/tree/main<br /><br />Topics covered:<br /><br />Streamlit:<br />For quick web UI prototyping<br /><br />OpenAI LLM, audio transcription.<br />HuggingFace hosted models<br /><br />Google Colab - for using huggingface transformers which download the model locally.<br /><br />Running Llama 2 locally via CTransformers or via Replicate API<br /><br />DuckduckGo search<br /><br />Prompt Templates:<br />FewShotPromptTemplate, Length based example selector<br /><br /><br />Splitters:<br />RecursiveCharacterTextSplitter<br />CharacterTextSplitter</span></p><p><span style="font-size: x-large;"><br />Embeddings:<br />SentenceTransformerEmbeddings<br />OpenAIEmbeddings<br />similarity_search<br /><br />Response formatter/output parser:<br />csv<br />json<br /><br />Vector DBs:<br />Pinecone - hosted<br />Chroma - local<br />FAISS(not typically used as vector db, but for similarity search etc)<br /><br />Memory:<br />ConversationSummaryMemory<br />TokenBuffer<br />WindowBuffer<br />ConversationBuffer<br /><br />Agents:<br />Pandas dataframe agent - to run some python code to analyze csv data<br /><br />Chains:<br />Utility Chains:<br />Load_Summarize_Chain - load and summarize docs<br />LLMRequestsChain - to fetch a url<br /><br /><br />Others:<br />SequentialChain<br />chain_type=stuff/map_reduce<br /><br /></span></p>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-3679409383800911562024-02-08T17:07:00.005+05:302024-02-08T17:07:53.938+05:30Building Microservices by Sam Newman<p><span style="background-color: white; color: #3d3b49; font-family: "Noto serif", serif;"><span style="font-size: x-large;">I wrote the bulk of this book throughout the whole of 2020 and in the first half of 2021. I used Visual Studio Code on macOS for most of the writing, although on rare occasions I also used Working Copy on iOS. OmniGraffle was used to create all the diagrams in the book. AsciiDoc was used to format the book and on the whole was excellent, with O’Reilly’s Atlas toolchain doing the magic of making a book appear on the other side.</span></span></p><p><span style="background-color: white; color: #3d3b49; font-family: "Noto serif", serif;"><span style="font-size: x-large;">-------------------------------------------</span></span></p>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-5898914003228660502022-09-01T22:04:00.002+05:302022-09-01T22:04:36.658+05:30Vim insert comma at multiple positions of each line<p> <span style="font-family: Menlo; font-size: 22px; font-variant-ligatures: no-common-ligatures;">::%s/\%12c\|\%20c\|\%30c/,/g<br />Insert comma at 12,20,30 positions of each line.</span></p>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-74598513181179145742021-10-28T15:52:00.000+05:302021-10-28T15:52:49.769+05:30Java 8 in action<p> Functional interfaces: Interfaces which define exactly one abstract method. They can be used by Lambdas to instantiate an object of that type with the abstract method implemented.<br /><br />For e.g. </p><p>Runnable r1 = () -> {System.out.println("Hello world 1");}<br /><br />process(r1); //prints Hello world 1<br /><br />Runnable is an interface with exactly one abstract method named "run";<br /><br />OR<br /><br />process(() -> System.out.println("This is awesome!!"));</p><p>------------------------------------------------------------</p><p><br /></p><p><br /></p>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-41956445249169589432021-03-15T17:59:00.003+05:302021-03-16T19:05:39.580+05:30Using asyncio in python - book notes<p> https://learning.oreilly.com/library/view/using-asyncio-in/9781492075325/ch01.html<br /><br />Chapter 1:<br />Threading—as a programming model—is best suited to certain kinds of computational tasks that are best executed with multiple CPUs and shared memory for efficient communication between the threads. In such tasks, the use of multicore processing with shared memory is a necessary evil because the problem domain requires it. For e.g. matrix multiplication using multiple cores with shared memory as done by numpy.<br /><br />But Network programming is not one of those domains. The key insight is that network programming involves a great deal of “waiting for things to happen,” and because of this, we don’t need the operating system to efficiently distribute our tasks over multiple CPUs. Furthermore, we don’t need the risks that preemptive multitasking brings, such as race conditions when working with shared memory. This is where AsyncIO helps. Typically OS has limits on number of threads. But since AsyncIO is a single threaded model, it offers a simple way to support many thousands of simultaneous socket connections, including being able to handle many long-lived connections for newer technologies like WebSockets, or MQTT.<br /><br />Chapter 2:<br />Drawbacks of threads:<br />1. Resource-intensive: Threads require extra operating system resources to create, such as preallocated, per-thread stack space that consumes process virtual memory up front.<br /><br />2. Threading can affect throughput - At very high concurrency levels (say, >5,000 threads), there can also be an impact on throughput due to context-switching costs, assuming you can figure out how to configure your operating system to even allow you to create that many threads!<br /><br />3. Threading is inflexible - The operating system will continually share CPU time with all threads regardless of whether a thread is ready to do work or not.<br /><br /><a href="https://github.com/dharm0us/python_threading_cutlery_threadbots">https://github.com/dharm0us/python_threading_cutlery_threadbots</a><br /><br />Chapter 3:<br />Asyncio provides another tool for concurrent programming in Python, that is more lightweight than threads or multiprocessing. In a very simple sense it does this by having an event loop execute a collection of tasks, with a key difference being that each task chooses when to yield control back to the event loop.</p><p>Quickstart:<br />You only need to know about seven functions to use Asyncio [for everyday use].<br />They cover:<br />1. Starting the asyncio event loop <br />2. Calling async/await functions <br />3. Creating a task to be run on the loop <br />4. Waiting for multiple tasks to complete <br />5. Closing the loop after all concurrent tasks have completed<br /><br />asyncio provides both a loop specification, <b>AbstractEventLoop</b>, and an implementation, <b>BaseEventLoop</b>. The clear separation between specification and implementation makes it possible for third-party developers to make alternative implementations of the event loop, and this has already happened with the uvloop project, which provides a much faster loop implementation than the one in the asyncio standard library module.<br /><br /><b>Task</b> is a subclass of <b>Future</b>, but they could easily be considered to be in the same tier. A Future instance represents some sort of ongoing action that will return a result via notification on the event loop, while a Task represents a coroutine running on the event loop. The short version is: a future is “loop-aware,” while a task is both “loop-aware” and “coroutine-aware.” As an end-user developer, you will be working with tasks much more than futures, but for a framework designer, the proportion might be the other way around, depending on the details.<br /><br /><b>Coroutines internals</b></p><p>>>> async def f():</p><p>... return 123</p><p>... </p><p>>>> type(f)</p><p><class 'function'></p><p>>>> import inspect</p><p>>>> inspect.iscoroutinefunction(f)</p><p>True</p><p>>>> coro = f()</p><p>>>> type(coro)</p><p><class 'coroutine'></p><p><br />Coroutine internals: using send() and StopIteration </p><p>>>> async def f():<br />... return 123 <br />>>> coro = f()<br /> >>> try:<br /> ... coro.send(None)<br />... except StopIteration as e:<br /> ... print('The answer was:', e.value)<br />... The answer was: 123 <br /><br />A coroutine is initiated by “sending” it a None. Internally, this is what the event loop is going to be doing to your precious coroutines; you will never have to do this manually. All the coroutines you make will be executed either with loop.create_task(coro) or await coro. It’s the loop that does the .send(None) behind the scenes. <br />When the coroutine returns, a special kind of exception is raised, called StopIteration. Note that we can access the return value of the coroutine via the value attribute of the exception itself. Again, you don’t need to know that it works like this: from your point of view, async def functions will simply return a value with the return statement, just like normal functions.</p><p>These two points, the <b>send()</b> and the <b>StopIteration</b>, define the start and end of the executing coroutine, respectively.<br /><br />The <b>New await Keyword</b> This new keyword await always takes a parameter and will accept only a thing called an awaitable, which is defined as one of these (exclusively!): <br />A coroutine (i.e., the result of a called async def function) <br />OR <br />any object implementing the __await__() special method. That special method must return an iterator.</p><p>It is useful to look at how coroutines may be fed exceptions. This is most commonly used for cancellation: when you call task.cancel(), the event loop will internally use coro.throw() to raise asyncio.CancelledError inside your coroutine.<br /><br />>>> import asyncio<br />>>> async def f():<br />... try:<br />... while True: await asyncio.sleep(0)<br />... except asyncio.CancelledError: <br />... print('I was cancelled!') <br />... else:<br />... return 111<br />>>> coro = f()<br />>>> coro.send(None)<br />>>> coro.send(None)<br />>>> coro.throw(asyncio.CancelledError) <br />I was cancelled! <br />Traceback (most recent call last):<br /> File "<stdin>", line 1, in <module><br />StopIteration <br /><br /><b>Event Loop</b><br />==================</p><p>Further reading/videos:<br />1. Dave Beazley, “Python Concurrency from the Ground Up: LIVE!”,<br /><br /></p>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-80144003323692899142021-01-29T04:43:00.001+05:302021-01-29T04:43:04.391+05:30Data Center Books<p> https://learning.oreilly.com/library/view/enterprise-data-center/0130473936/</p><p>https://learning.oreilly.com/library/view/energy-efficient-computing-and/9781786301857/</p><p>https://learning.oreilly.com/library/view/grow-a-greener/9781587059919/</p><p>https://learning.oreilly.com/library/view/build-the-best/1587051826/</p><p>https://learning.oreilly.com/library/view/data-center-fundamentals/9781587140754/</p><p>https://learning.oreilly.com/library/view/ccna-data-center/9781118661260/</p>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-19972502333574236322020-09-18T01:09:00.042+05:302021-03-15T05:18:35.635+05:30Designing Data Intensive Applications: Notes<p><span style="font-family: inherit; font-size: medium;"><a href="https://learning.oreilly.com/library/view/designing-data-intensive-applications">https://learning.oreilly.com/library/view/designing-data-intensive-applications</a><br />--------------<br />Hard disks are reported as having a mean time to failure (MTTF) of about 10 to 50 years [5, 6]. Thus, on a storage cluster with 10,000 disks, we should expect on average one disk to die per day.<br /><br />Rule of thumb. Number of failures per day = Total Disks(hardware)/MTTF in days<br />So here 10k/10k(30 years) = 1.<br />-------------<br />Algorithms to calculate a good approximation of percentiles at minimal CPU and memory cost are: forward decay, t-digest or HdrHistogram.<br />-----------<br />Chapter 2: Data Models and Query Languages: Summary<br /></span></p><p><span style="font-family: inherit; font-size: medium;">Data models are a huge subject, and in this chapter we have taken a quick look at a broad variety of different models. We didn’t have space to go into all the details of each model, but hopefully the overview has been enough to whet your appetite to find out more about the model that best fits your application’s requirements.</span></p><p><span style="font-family: inherit; font-size: medium;">Historically, data started out being represented as one big tree (the hierarchical model), but that wasn’t good for representing many-to-many relationships, so the relational model was invented to solve that problem. More recently, developers found that some applications don’t fit well in the relational model either. New nonrelational “NoSQL” datastores have diverged in two main directions:</span></p><p><span style="font-family: inherit; font-size: medium;">Document databases target use cases where data comes in self-contained documents and relationships between one document and another are rare.</span></p><p><span style="font-family: inherit; font-size: medium;">Graph databases go in the opposite direction, targeting use cases where anything is potentially related to everything.</span></p><p><span style="font-family: inherit; font-size: medium;">All three models (document, relational, and graph) are widely used today, and each is good in its respective domain. One model can be emulated in terms of another model—for example, graph data can be represented in a relational database—but the result is often awkward. That’s why we have different systems for different purposes, not a single one-size-fits-all solution.</span></p><p><span style="font-family: inherit; font-size: medium;">One thing that document and graph databases have in common is that they typically don’t enforce a schema for the data they store, which can make it easier to adapt applications to changing requirements. However, your application most likely still assumes that data has a certain structure; it’s just a question of whether the schema is explicit (enforced on write) or implicit (assumed on read).</span></p><p><span style="font-family: inherit; font-size: medium;">Each data model comes with its own query language or framework, and we discussed several examples: SQL, MapReduce, MongoDB’s aggregation pipeline, Cypher, SPARQL, and Datalog. We also touched on CSS and XSL/XPath, which aren’t database query languages but have interesting parallels.<br />-------------------------------<br /><br />Chapter 3: Storage and retrieval<br /><br />As a rule of thumb, LSM-trees(Log structured merge tree) are typically faster for writes, whereas B-trees are thought to be faster for reads. Reads are typically slower on LSM-trees because they have to check several different data structures and SSTables at different stages of compaction.<br /><br />A downside of log-structured storage is that the compaction process can sometimes interfere with the performance of ongoing reads and writes.<br /><br />An advantage of B-trees is that each key exists in exactly one place in the index, whereas a log-structured storage engine may have multiple copies of the same key in different segments. This aspect makes B-trees attractive in databases that want to offer strong transactional semantics: in many relational databases, transaction isolation is implemented using locks on ranges of keys, and in a B-tree index, those locks can be directly attached to the tree.<br /><br />B-trees are very ingrained in the architecture of databases and provide consistently good performance for many workloads, so it’s unlikely that they will go away anytime soon. In new datastores, log-structured indexes are becoming increasingly popular. There is no quick and easy rule for determining which type of storage engine is better for your use case, so it is worth testing empirically.<br /><br />----------</span></p><p><span style="font-family: inherit; font-size: medium;">Indexing: A compromise between a clustered index (storing all row data within the index) and a nonclustered index (storing only references to the data within the index) is known as a covering index or index with included columns, which stores some of a table’s columns within the index.This allows some queries to be answered by using the index alone (in which case, the index is said to cover the query)<br /><br />In MySQL’s InnoDB storage engine, the primary key of a table is always a clustered index, and secondary indexes refer to the primary key (rather than a heap file location).<br />-------<br />Data warehousing (OLAP as opposed to OLTP)<br />Column storage: Column compression using bitmaps. <br />We are going to duplicate data anyways so why not sort each copy of data differently and for a given query choose the appropriate replica.</span></p><p><span style="font-family: inherit; font-size: medium;">--------</span></p><p><span style="font-family: inherit; font-size: medium;">Chapter 4: Encoding etc.<br />Protobuf/Thrift vs Avro => all are binary encodings.<br />Avro better suited for dynamic schema.<br />Parquet: column oriented format.<br />Forward/Backward compatibility<br />Avro - Writer's and Reader's schemas can be different. Avro has a way to resolve the differences.</span></p><p><span style="font-family: inherit; font-size: medium;">--------------------<br />Chapter 5: Replication</span></p><p><span style="font-family: inherit; font-size: medium;">single-leader, multi-leader, leaderless<br />synchronous, asynchronous, semi-synchronous(one follower is sync, others async)<br />Mysql => statement based replication vs row based replication. Row based replication is used when there is non-determinism in the statement(for e.g. NOW() or RAND())<br /><br />How to ensure read-after-write consistency?<br />Route the requests to the leader for a while after an update is made by the user as the followers might be lagging.<br />Issues to consider: cross device requests, devices might be on different networks and hence routed to different DCs. Or the client can remember update timestamp and leader can route to the follower which is up-to-date till that timestamp.<br /><br />Montonic reads - a guarantee that user won't see content going backwards in time. Stronger than eventual consistency but weaker than strong consistency. One way to ensure is to always read from the same replica for a given user.</span></p><p><span style="font-size: medium;">Consistent prefix reads - this guarantee says that if a sequence of writes happens in a certain order, then anyone reading those writes will see them appear in the same order.<br /><br />Multi leader configuration is a dangerous territory which should be avoided whenever possible. But is unavoidable at times - for e.g. Calendar apps(or Collaborative editing apps like Google Docs) on various devices - each device has a local DB which has to accept the write even when offline. It's multi-leader config taken to extreme wherein each device is a DC.<br /><br />For collaborative editing - you either take a lock on the document or make the unit of change very small - e.g. one keystroke.<br /><br />CouchDB aims to make multi-leader config simpler.</span></p><p><span style="font-size: large;">CONFLICT AVOIDANCE - The simplest strategy for dealing with conflicts is to avoid them: if the application can ensure that all writes for a particular record go through the same leader, then conflicts cannot occur. Since many implementations of multi-leader replication handle conflicts quite poorly, avoiding conflicts is a frequently recommended approach.<br /><br />CONFLICT RESOLUTION APPROACH - Last write wins(LWW - based on the highest timestamp(or id))<br /><br />Some databases have a provision for conflict handler when the conflict is detected at read or write time.</span></p><p><span style="font-size: medium;">There has been some interesting research into automatically resolving conflicts caused by concurrent data modifications. </span></p><p><span style="font-size: medium;">Conflict-free replicated datatypes (CRDTs) are a family of data structures for sets, maps (dictionaries), ordered lists, counters, etc. that can be concurrently edited by multiple users, and which automatically resolve conflicts in sensible ways. <br /><br />Mergeable persistent data structures - track history explicitly, similarly to the Git version control system, and use a three-way merge function (whereas CRDTs use two-way merges). <br /><br />Operational transformation - is the conflict resolution algorithm behind collaborative editing applications such as Etherpad and Google Docs. It was designed particularly for concurrent editing of an ordered list of items, such as the list of characters that constitute a text document.</span></p><p><span style="font-size: medium;">LEADERLESS REPLICATION - came back in vogue after Amazon's DynamoDB.<br /><br />In some systems, clients send requests to all the nodes whereas in others there is a co-ordinator node.<br /><br />Writes are sent to all 3 replicas, if one node is down the client would still proceed since it received 2 OKs. While reading, the read requests should go to all the 3 nodes and based on the version of the record, client would decide which one to use.<br /><br />Read repair - when a client detects a stale version from a replica, it writes back the newer version back to the replica. But it won't work well for values which ain't frequently read.<br /><br />Anti-Entropy process - a process which keeps checking for differences in replicas in background.<br /><br />Quorum Condition - If there are 'n' replicas and you wait for 'w' writes and 'r' reads to be acked you are guaranteed to get the latest value as long as w + r > n. 'w' and 'r' can be configured based on the workload. For e.g. w = n and r = 1 for read heavy loads, but here the writes will fail even if one node is unavailable.<br /><br />But even with w + r > n there are edge cases when the replicas may end up with old writes, for e.g. concurrent writes, concurrent r+w, writes partially succeeding but not rolled back, a failed node restored with an older replica etc.<br /><br />Sloppy quorom and hinted handoff - for a write, if the designated 'w' nodes for that record are not reachable, you can write to any 'w' nodes and once the 'home' nodes come back up the records are 'handed off'.<br /><br />Merging records - for e.g. multiple clients adding items to shopping cart. Increment the version every time the cart is updated. For deleted items, add a tombstone marker which tells us the version in which the item was deleted. Handling it with multiple replicas calls for 'version vectors' or 'dotted version vectors'.</span></p><p><span style="font-family: inherit; font-size: medium;">---------------------------</span></p><p><span style="font-size: medium;"><span style="font-family: inherit;">Chapter 6: Partitioning</span><br /><br />SKEWED LOAD - Today, most data systems are not able to automatically compensate for such a highly skewed workload, so it’s the responsibility of the application to reduce the skew. For example, if one key is known to be very hot, a simple technique is to add a random number to the beginning or end of the key. Just a two-digit decimal random number would split the writes to the key evenly across 100 different keys, allowing those keys to be distributed to different partitions. <br /><br />However, having split the writes across different keys, any reads now have to do additional work, as they have to read the data from all 100 keys and combine it. This technique also requires additional bookkeeping: it only makes sense to append the random number for the small number of hot keys; for the vast majority of keys with low write throughput this would be unnecessary overhead. Thus, you also need some way of keeping track of which keys are being split.<br /><br />SECONDARY INDEXES - A lot of DBs don't support secondary indexes but they are the raison d’être of search servers such as Solr and Elasticsearch. The problem with secondary indexes is that they don’t map neatly to partitions. There are two main approaches to partitioning a database with secondary indexes: document-based partitioning and term-based partitioning.<br /><br />PARTITIONING SECONDARY INDEXES BY DOCUMENTS (Local Indexing) - In this indexing approach, each partition is completely separate: each partition maintains its own secondary indexes, covering only the documents in that partition. For that reason, a document-partitioned index is also known as a local index. If you want to search for red cars, you need to send the query to all partitions, and combine all the results you get back. This approach is known as scatter/gather, and it can make read queries on secondary indexes quite expensive. Even if you query the partitions in parallel, scatter/gather is prone to tail latency amplification Nevertheless, it is widely used: MongoDB, Riak [15], Cassandra [16], Elasticsearch [17], SolrCloud [18], and VoltDB [19] all use document-partitioned secondary indexes. Most database vendors recommend that you structure your partitioning scheme so that secondary index queries can be served from a single partition, but that is not always possible, especially when you’re using multiple secondary indexes in a single query (such as filtering cars by color and by make at the same time).<br /><br />PARTITIONING SECONDARY INDEXES BY TERMS (Global Indexing) - For e.g. entities with colors starting a-d reside in one partition and e-h in another. This means that reads are faster as you have to search only one partition but writes are expensive as you have to update 2 partitions(more if you have more than 1 secondary index).<br /><br />REBALANCING PARTITIONS (Assigning partitions to nodes)<br />1. Don't do Hash Mod N. If we divide hashes keys into ranges and assign to a node by doing hash mod N a lot of keys will have to be reassigned if number of nodes changes. N is number of nodes here.<br />2. Fixed number of partitions - for e.g. fix the number of partitions to 1000 for a 10 node cluster. Assignment of a key to a partition is always fixed. If a new node is added, some partitions will be moved to the new node. If a node goes down, its partitions will be distributed to the rest of the nodes. So essentially only partitions move around to nodes, assignment of a key is always fixed to a partition. This approach won't work well if the dataset size is highly variable. <br />3. Dynamic partitioning - start with a fixed set of partitions but keep merging and splitting based on the partition sizes.<br /><br />So in 2. size of partitions is proportional to the dataset size where in 3. the number of partitions is proportional to the dataset size.<br /><br />4. Fixed number of partitions per node - if a new node joins it splits some existing partitions and moves to itself. So the load per node is fairly balanced.<br /><br />REQUEST ROUTING(Service Discovery) - to which IP/Port request for a key is routed?<br />3 approaches - 1. A routing layer 2. Partition aware client 3. Send to any node which is responsible for forwarding it correctly. Often the metadata is tracked by ZooKeeper which is informed by every node of its partitions. Updates published by ZooKeeper can be consumed by the routing tier or clients.<br />Often Gossip protocol is used when using 3.<br /><br />Summary:<br />Key range partitioning good for range scan queries as the partition can keep the keys sorted.<br />Hash based partitioning destroys the ordering but achieves fair distribution of load.<br /></span><span style="font-family: inherit; font-size: large;">---------------------------</span></p><p><span style="font-family: inherit; font-size: medium;">Chapter 7: Transactions<br />IMPLEMENTING READ COMMITTED - One way is to acquire lock for reads but that's slow so DBs hold the value before the writer takes a lock which is given to readers. After the transaction is committed, DB updates it to the new value.<br /><br />However, Read Committed can't prevent read skew and non-repeatable reads. It causes problems with backups, since they may reflect partial information. Or any large analytic/integrity queries would fail too. SNAPSHOT ISOLATION is the solution here. It ensures that every transaction reads from a consistent snapshot of the DB.<br /><br />IMPLEMENTING SNAPSHOT ISOLATION - each row has multiple versions. Each row has created_by = id_of_txn_which_created_it and deleted_by = id_of_txn_which_deleted_it. An update is treated as Delete + Create. A read transaction can't see rows which were modified by in-progress/aborted/committed_with_a_bigger_txn_id. It's called MVCC - multi version concurrency control. It's an enhancement over the implementation of READ COMMITTED. Garbage collection gets rid of older rows.<br /><br />How to update indexes - one way is to have the index point to all the row versions and filter out those which are not visible to the current transaction. There are other better ways though.<br /><br />Snapshot isolation is often known as Serializable/Repeatable read.<br /><br />LOST UPDATE - read-modify-write done by two txns in parallel. Only one will survive. Some DBs ensure that value = value + 1 is atomic and concurrency safe. Need to watch out for ORMs which use read-modify-write and not DB's atomic operations. Another alt is to take locks on those rows. Some DBs(not MySql) can detect lost updates using Snapshot Isolation and abort the offending txn.<br /><br />Another way to handle lost updates is Compare and Set.<br />Update table set val = 'new' where id =1 and val = 'old';<br />So it will work only if no one has modified the value since it was first read.<br />But some DBs may implement it incorrectly if they read values from an old snapshot.</span></p><p><span style="font-family: inherit; font-size: medium;">Locks and Compare-and-Set will not work well in replicated systems where there are multiple copies of DB. So the onus is on DB/Application to figure out how to merge them well. Atomic operations can work well in replicated context if they are commutative. Last Write Wins is prone to lost updates<br /><br />WRITE SKEW - This effect, where a write in one transaction changes the result of a search query in another transaction, is called a phantom . Snapshot isolation avoids phantoms in read-only queries, but in read-write transactions like the examples we discussed, phantoms can lead to particularly tricky cases of write skew.<br />For e.g. at least one doctor has to be on-call. If two on-call doctors go off at the same time, you may end up with zero on-calls. Or multiple bookings for same room. In a multiplayer game, two players might move to same location. Double spending - where a user might the same money twice.<br />The problem is that we are reading some rows and two different transactions update two different rows. So there is no single object to attach a lock to.<br /><br />One solution is Materializing Conflicts - for e.g. for room booking case create a table with all possible time slots(15 minutes) and lock the required slots for a given transaction. But it's a difficult approach in general.<br /><br />Another solution is SERIALIZABLE ISOLATION LEVEL. It is usually regarded as the strongest isolation level. Here the DB prevents all possible race conditions. Net outcome is as if all transactions ran serially one after other. If it's so good, why isn't everyone using it? To understand that, let's explore how could it be possibly implemented. There are 3 ways:<br /><br />1. Executing the transactions actually serially - it's been possible since around 2007 as the RAM has become cheaper and the entire active dataset can be kept it memory. Redis does it. Single threaded execution loop. But throughput it limited to one CPU core. To make single threaded executions efficient, Stored procedure transactions are used in which the entire transaction code is stored as a procedure with the DB and only the parameters are passed. This avoids network costs for the regular interactive transactions where the client first issue a query and then submits another query based on its results and so on. Also, it removes the need for keeping many transactions open.<br /><br />But Stored procedures are not very popular in general due to - archaic programming languages, harder debugging, version control, deployment, metrics collection. A DB is much more performance sensitive than an application server and again performance debugging is difficult.</span></p><p><span style="font-family: inherit; font-size: medium;">But these can be overcome. Modern day stored procedures use general purpose programming languages such as Lua for Redis, others us Java etc. VoltDB also uses stored procs for replication assuming they are deterministic(ref date/time functions). To use all the CPU cores, you can partition your data and allocate CPU cores to them. But cross partition transactions would be slower(by an order of magnitude in VoltDB). When the data has multiple secondary indexes, cross partition transactions would be much slower. For write heavy workloads, these things are not good.<br /><br />So serial execution of transactions can be used when: active dataset fits into memory, each transaction is small and fast so as to not block others, write load is low enough to be served with single CPU core, cross partition transactions slowness can be tolerated.<br /></span></p><div style="text-align: left;"><br />2PL - 2 PHASE LOCKING(different from 2 phase commit): This was the only choice for almost 30 years to achieve serialization. In Snapshot Isolation the mantra is <i>writers never block readers and readers never block writers</i> whereas in 2PL <i>writers block both readers/writers and readers block both readers/writers</i>. Implementation of 2PL involves taking locks in shared/exclusive mode and detecting deadlocks. Details:<br />1. If a txn wants to read an object it must acquire a shared lock. Several txns can hold the shared lock but if a txn already holds an ex-lock the reading txns must wait for it to finish.<br />2. If a txn wants to write it must obtain an ex-lock, no other txn can hold any lock on this object whether ex or shared. This txn must wait for all the existing locks to be released.</div><div style="text-align: left;"><br />So it has performance problems in which one long transaction can block all others.<br /><br />Predicate locks - it applies to results of a query. For e.g. the room booking example - the txn must take the lock on the query(select * from bookings where room_id =1 and start_date >= and end_date <= y). It will ensure that only one txn is able to make a booking since it has got the lock on this predicate. The key idea is that the predicate lock applies to even those objects which are not yet in the DB.<br /><br />Predicate locks have poor performance. When there are many active transactions, matching for predicates becomes very time consuming.<br />Due to the poor performance of predicate locks many DBs implement index range locking to achieve 2PL. A query for room booking can be generalized for all bookings with that room_id or all bookings for the time range. And the corresponding range in the index can be locked. If there is no index suitable for this query, a shared lock can be taken on the entire table.<br /><br />So far it seems that concurrency control in DBs has 2 options - stronger Serializable isolation with poor performance(2PL) or poor scalability(serial execution in-memory single-core Redis) OR weak isolation with better performance but many race conditions.<br /><br />But there is a new(2008) algo - Serializable Snapshot Isolation(SSI) - which has full serializability with only small performance penalty as compared to Snapshot Isolation. It might become new default in future.<br /><br />Optimistic Concurrency Control - Proceed with the transactions in the hope that nothing bad will happen and before committing make sure that nothing bad happened.<br /><span style="font-family: inherit; font-size: medium;">-----------------------------</span><br /><span style="font-family: inherit; font-size: medium;">Chapter 8: The Trouble with Distributed Systems:</span><br /><span style="font-family: inherit; font-size: medium;">Monotonic clock vs time_of_the_day_clock => Monotonic clock is guaranteed to move forward as opposed to the time of the day clock which may jump back in time.</span><br /><span style="font-family: inherit; font-size: medium;">Issues with NTP servers: network congestion might result in delayed updates, NTP servers might be firewalled, some NTP servers might give wrong time. Clock drift should be closely monitored. Leap seconds might cause issues like time going back.</span><br /><br /><span style="font-family: inherit; font-size: medium;">Due to all these factors, time returned by the system clock has a confidence interval or uncertainty attached to it. </span><span style="font-size: medium;">Google’s TrueTime API in Spanner, which explicitly reports the confidence interval on the local clock. When you ask it for the current time, you get back two values: [earliest, latest], which are the earliest possible and the latest possible timestamp. Based on its uncertainty calculations, the clock knows that the actual current time is somewhere within that interval. The width of the interval depends, among other things, on how long it has been since the local quartz clock was last synchronized with a more accurate clock source.<br /></span><br />Consider Snapshot isolation for transactions happening in multiple DB nodes. Snapshot isolation needs to know which txn happened earlier and which one later? But how do you order these txns across nodes, since you don't know for sure whether clocks are in sync across nodes? Using the TrueTime API's confidence intervals, you can decide if they don't overlap. If they do overlap, then it's tricky. Google's DCs have GPS receivers to ensure that clocks are synced upto 7 ms.<br /><br />Process Pauses:<br />while (true) { </div><div style="text-align: left;"> request = getIncomingRequest();<br /> // Ensure that the lease always has at least 10 seconds remaining </div><div style="text-align: left;"> if (lease.expiryTimeMillis - System.currentTimeMillis() < 10000) { <br /> lease = lease.renew(); <br /> } <br /> if (lease.isValid()) {<br /> process(request); <br /> }<br />}<br /><br />If there is a single leader per partition who has to renew the lease periodically and the above code is being used, there are multiple issues. Firstly, the lease expiry time and current time are computed on different machines so drift will come into play. Also the monotonic clock is not being used. Secondly, process may pause for various reasons(GC, Live VM Migration, Thread Context Switch, IO waits - disk/network) and in the meantime another node may take the lease and do something unsafe.<br /><br />Limiting the scope of Garbage collection - one approach is to do GC only for short lived objects and periodically restart process to deal with long lived objects while redirecting traffic to other nodes. Other is to not handle traffic on a node when GC is about to start. Something like rolling upgrades.<br /><br />Fencing - client 1 takes lock to write to a file but GC starts and its lease expires, in the meantime client 2 takes the lock and writes to the file, later client 1 wakes up and writes again. To avoid this the file storage server could ask for an auto incrementing lock token. Client 2 would send 34 and client 1 33 so client 1's request would be rejected.<br /><br />So far we have assumed that nodes are unreliable but honest. What if some nodes "lie"? May be they were compromised. This leads us to Byzantine systems - A system is Byzantine fault-tolerant if it continues to operate correctly even if some of the nodes are malfunctioning and not obeying the protocol, or if malicious attackers are interfering with the network. But in most server-side data systems, the cost of deploying Byzantine fault-tolerant solutions makes them impracticable.<br /><span style="font-family: inherit; font-size: medium;">-------------------<br />Chapter 9: Consistency and Consensus<br />Linearizability - once a more recent value has been read, subsequent reads shouldn't get the older value. If a DB offers both </span><span style="font-size: medium;">Linearizability and Serializability, it's called strict serializability.</span><br /><span style="font-size: large;">Linearizability essentially means</span><span style="font-size: medium;"> “ such systems behave as though there is only a single copy of the data, and all operations on it are atomic,”</span><br /><br /><span style="font-size: medium;">When is </span><span style="font-size: large;">Linearizability useful? Leader election. Once a leader has been elected, everyone should agree about who holds this lock. Distributed locking.<br /><br />Uniqueness constraints also require </span><span style="font-size: medium;">Linearizability. It also avoids seat overbooking, double spend of balance etc.</span><br /><br /><span style="font-size: medium;">Cross-channel timing dependencies - Often these problems arise when there is an additional communication channel between 2 services.</span><br /><br /><span style="font-size: medium;">Implementing Linearizable systems - single leader replication is possibly linearizable but not multi leader and leaderless systems. As multi leader systems are quite useful in the multi DC setup, insisting on Linearizability would stop us from getting stronger systems. Thus, applications that don’t require linearizability can be more tolerant of network problems. This insight is popularly known as the CAP theorem.<br /><br />A better way of phrasing CAP would be either Consistent or Available when Partitioned.<br /><br />Although linearizability is a useful guarantee, surprisingly few systems are actually linearizable in practice. For example, even RAM on a modern multi-core CPU is not linearizable. If one thread writes and another thread reads afterwards it might hit the memory cache. The reason to tolerate this is <i>performance</i>. Same is true for many DBs which avoid </span><span style="font-size: large;">Linearizability.<br /></span><span style="font-size: medium;"><br />Can’t we maybe find a more efficient implementation of linearizable storage? It seems the answer is no: Attiya and Welch prove that if you want linearizability, the response time of read and write requests is at least proportional to the uncertainty of delays in the network. In a network with highly variable delays, like most computer networks, the response time of linearizable reads and writes is inevitably going to be high.<br /><div style="font-size: medium;"><span style="font-size: medium;"><br />If a system obeys the ordering imposed by causality, we say that it is <b>causally consistent</b>. For example, snapshot isolation provides causal consistency: when you read from the database, and you see some piece of data, then you must also be able to see any data that causally precedes it (assuming it has not been deleted in the meantime).<br /><br /><b>Linearizability</b> - In a linearizable system, we have a <b>total order</b> of operations: if the system behaves as if there is only a single copy of the data, and every operation is atomic, this means that for any two operations we can always say which one happened first.</span></div><div style="font-size: medium;"><span style="font-size: medium;"><br /><b>Causality</b> We said that two operations are concurrent if neither happened before the other (see “The “happens-before” relationship and concurrency”). Put another way, two events are ordered if they are causally related (one happened before the other), but they are incomparable if they are concurrent. This means that causality defines a <b>partial order</b>, not a total order: some operations are ordered with respect to each other, but some are incomparable.</span></div><div><span style="font-size: medium;"><br /><span style="font-size: small;">As per the above there are no concurrent operations in a linearizable datastore: there must be a single timeline along which all operations are totally ordered. Concurrency would mean that the timeline branches and merges again—and in this case, operations on different branches are incomparable (i.e., concurrent).</span><br /><br /><span style="font-size: small;">So</span><b style="font-size: medium;"> Linearizability </b><span style="font-size: small;">implies </span><b style="font-size: medium;">Causality</b><span style="font-size: small;"> and hence stronger than it. In many cases, systems that appear to require linearizability in fact only really require causal consistency, which can be implemented more efficiently. Based on this observation, researchers are exploring new kinds of databases that preserve causality, with performance and availability characteristics that are similar to those of eventually consistent systems [49, 50, 51]. As this research is quite recent, not much of it has yet made its way into production systems, and there are still challenges to be overcome [52, 53]. However, it is a promising direction for future systems.</span><br /><br /><span style="font-size: small;">Generating total order with sequence numbers - use Lamport Timestamps.</span><br /><br /><span style="font-size: small;">Atomic commit and 2 phase commit - </span><br />Two-phase commit is an algorithm for achieving atomic transaction commit across multiple nodes—i.e., to ensure that either all nodes commit or all nodes abort.<br /><br />2PC uses a new component that does not normally appear in single-node transactions: a coordinator (also known as transaction manager). The coordinator is often implemented as a library within the same application process that is requesting the transaction.<br /><br />A 2PC transaction begins with the application reading and writing data on multiple database nodes, as normal. We call these database nodes participants in the transaction. When the application is ready to commit, the coordinator begins phase 1: it sends a prepare request to each of the nodes, asking them whether they are able to commit. The coordinator then tracks the responses from the participants: If all participants reply “yes,” indicating they are ready to commit, then the coordinator sends out a commit request in phase 2, and the commit actually takes place. If any of the participants replies “no,” the coordinator sends an abort request to all nodes in phase 2.<br /><br /></span></div><span style="font-size: medium;">========================</span><br /><br /></span></div><div style="text-align: left;"><span style="font-family: inherit; font-size: medium;">--------------------------------<br /></span><span style="font-family: inherit; font-size: large;">Chapter 10: Batch Processing</span></div><p><span style="font-family: inherit; font-size: medium;"><br /></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Services vs Batch processing vs Stream processing</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Services => Online, Req/Resp, quality measured by response time</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Batch processing => Offline job processing, measured by throughput</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Stream processing => Near real-time, somewhere between online and offline</span></span></p><p class="p2" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 22px;"><span style="font-family: inherit; font-size: medium;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Batch processing =></span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Unix tools, piped processing, independent of each other but work well with each other due to the same interface</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">MapReduce is similar, uses HDFS which is shared-nothing architecture, unlike SAN and NAS.</span></span></p><p class="p2" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 22px;"><span style="font-family: inherit; font-size: medium;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">First, each map task partitions its output by reducer, based on the hash of the key. Each of these partitions is written to a sorted file on the mapper’s local disk.</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Whenever a mapper finishes reading its input file and writing its sorted output files, the MapReduce scheduler notifies the reducers that they can start fetching the output files from that mapper. The reducers connect to each of the mappers and download the files of sorted key-value pairs for their partition.</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">The reduce task takes the files from the mappers and merges them together, preserving the sort order. Thus, if different mappers produced records with the same key, they will be adjacent in the merged reducer input. The reducer is called with a key and an iterator that sequentially scans over all records with the same key (which may in some cases not all fit in memory).</span></span></p><h1 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 22px; text-align: left;"><span style="font-family: inherit; font-size: medium;"><br /></span></h1><p class="p2" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 22px;"><span style="font-family: inherit; font-size: medium;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Joins in batch processing:</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">Joins in case of a single entity in a normal online service are best handled with an index.</span></span></p><p class="p1" style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><span style="font-family: inherit; font-size: medium;">But in case of batch processing, we want to resolve all the entities.<br />Map side joins.<br />Reduce side joins.</span></span></p><p><span style="font-family: inherit; font-size: medium;">Comparing Hadoop to Distributed Databases:</span></p><div><span style="font-family: inherit; font-size: medium;">The biggest difference is that MPP(massively parallel processing) databases focus on parallel execution of analytic SQL queries on a cluster of machines, while the combination of MapReduce and a distributed filesystem provides something much more like a general-purpose operating system that can run arbitrary programs.</span></div><div><span style="font-family: inherit; font-size: medium;"><br /></span></div><div><span style="font-size: medium;"><span><span style="font-family: inherit;">And this is why MapReduce is designed to tolerate frequent unexpected task termination: it’s not because the hardware is particularly unreliable, it’s because the freedom to arbitrarily terminate processes enables better resource utilization in a computing cluster.</span><br /><br />In an environment where tasks are not so often terminated, the design decisions of MapReduce make less sense. <br /><br />So </span><span style="font-family: inherit;">Improvements over MapReduce:</span></span></div><p><span style="font-family: inherit; font-size: medium;">Dataflow Engines: Get rid of intermediate file writing to disk(materialization) and stream the output directly to the next job. This will work unless sorting is required.</span></p><p><span style="font-family: inherit; font-size: medium;">Pregel model for iterative processing on Graphs: Repeat until done. Very complicated to do it via MapReduce. For e.g. PageRank.</span></p><p><span style="font-family: inherit; font-size: medium;">=============================</span></p><p><span style="font-family: inherit; font-size: medium;">Chapter 11:</span></p><p><span style="font-size: medium;"><span style="font-family: inherit;">Stream processing</span><br /><br />In principle, a file or database is sufficient to connect producers and consumers: a producer writes every event that it generates to the datastore, and each consumer periodically polls the datastore to check for events that have appeared since it last ran. <br /><br />However polling becomes expensive if the datastore is not designed for this kind of usage. The more often you poll, the lower the percentage of requests that return new events, and thus the higher the overheads become. Instead, it is better for consumers to be notified when new events appear. Databases have traditionally not supported this kind of notification mechanism very well: relational databases commonly have triggers, which can react to a change but they are very limited in what they can do and have been somewhat of an afterthought in database design. Instead, specialized tools have been developed for the purpose of delivering event notifications.<br />-----------------------------------------------------------------<br /></span></p><h1 style="text-align: left;"><span style="font-size: medium;">One option is to directly connect producers with consumers: for e.g. TCP or UDP sockets.</span></h1><span style="font-size: medium;">Message brokers: specialized databases optimized for these scenarios.<br /><br />Typically message brokers don't persist messages after they have been delivered.<br /><br />Why can we not have a hybrid, combining the durable storage approach of databases with the low-latency notification facilities of messaging? This is the idea behind log-based message brokers.<br /></span><h1 style="text-align: left;"><span style="font-size: medium; font-weight: normal;">More recently, there has been growing interest in change data capture (CDC), which is the process of observing all data changes written to a database and extracting them in a form in which they can be replicated to other systems. CDC is especially interesting if changes are made available as a stream, immediately as they are written.</span></h1><div><span style="font-size: medium;">For example, you can capture the changes in a database and continually apply the same changes to a search index. If the log of changes is applied in the same order, you can expect the data in the search index to match the data in the database. The search index and any other derived data systems are just consumers of the change stream.</span><br /><br /><span style="font-size: medium;">Apache Kafka provides CDC connectors for various databases.</span><br /><br /><span style="font-size: medium;">Increasingly, databases are beginning to support change streams as a first-class interface, rather than the typical retrofitted and reverse-engineered CDC efforts.<br /><br />Command and event: Command is the user request to do something. Event is generated when the Command is found to be valid and successfully executed.<br /><br />Stream analytics systems sometimes use probabilistic algorithms, such as Bloom filters (which we encountered in “Performance optimizations”) for set membership, HyperLogLog for cardinality estimation, and various percentile estimation algorithms.<br /><br />This use of approximation algorithms sometimes leads people to believe that stream processing systems are always lossy and inexact, but that is wrong: there is nothing inherently approximate about stream processing, and probabilistic algorithms are merely an optimization.<br /><br /></span></div><span style="font-size: medium;"><br /></span><p></p>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-88267320827860135952020-09-06T01:39:00.001+05:302020-09-06T01:39:15.563+05:30free video editing tools<div dir="ltr">handbrake => burn subtitles into mp4<div>avidemux => cut video anywhere(not only the ends)<br>freemake => vob to mp4 converter</div></div> oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com2tag:blogger.com,1999:blog-5434031919110451140.post-17389231619040032692020-08-13T14:47:00.001+05:302020-08-13T14:50:14.774+05:30grep extract only matching text<div dir="ltr"> cat filename | grep --extended-regexp --only-matching --regexp="data-text=\"[a-zA-Z0-9 ]*\""<br /><div><br /></div><div>I had an html where I wanted to extract text like "data-text=abcd"</div></div> oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-23847506502018331282020-08-05T00:18:00.004+05:302020-08-05T00:18:39.343+05:30html video player with zooming<div><html><head><meta name="viewport" content="width=device-width"></head><body></div><div><style></div><div>.video-container {</div><div> position: absolute;</div><div> top: 0;</div><div> bottom: 0;</div><div> width: 100%;</div><div> height: 100%; </div><div> overflow: hidden;</div><div>}</div><div><br /></div><div>.video-container video {</div><div> /* Make video to at least 100% wide and tall */</div><div> min-width: 100%; </div><div> min-height: 100%; </div><div> </div><div> /* Setting width & height to auto prevents the browser from stretching or squishing the video */</div><div> width: auto;</div><div> height: auto;</div><div> </div><div> /* Center the video */</div><div> position: absolute;</div><div> top: 50%;</div><div> left: 50%;</div><div> transform: translate(-50%,-50%);</div><div>}</div><div><br /></div><div><br /></div><div>////////////////////////////////////////</div><div><br /></div><div>body { </div><div> background: #000;</div><div> color: #fff;</div><div> font-family: 'Oxygen', sans-serif;</div><div>}</div><div> </div><div>.overlay {</div><div> background: rgba(0,0,0,0.5);</div><div> position: absolute;</div><div> bottom: 0;</div><div> left: 0;</div><div> right: 0;</div><div> width: 95%;</div><div> max-width: 45em;</div><div> margin: auto auto 1em;</div><div> box-sizing: border-box;</div><div> padding: 2em;</div><div> line-height: 1.5;</div><div> text-align: center;</div><div> </div><div> :last-child { margin-bottom: 0; }</div><div><br /></div><div> h1 {</div><div> font-size: 14pt;</div><div> font-weight: normal;</div><div> text-shadow: 0 0 .3em #000;</div><div> margin: 0 0 1em;</div><div> }</div><div><br /></div><div> p {</div><div> font-size: 11pt;</div><div> text-shadow: 0 0 .3em #000;</div><div> margin: 1em 0;</div><div> }</div><div> a {</div><div> color: #fff;</div><div> }</div><div>}</div><div><br /></div><div>code { font-family: monospace; }</div><div></style></div><div><br /></div><div><div class="video-container"></div><div><video controls="" autoplay="" name="media"><source src="mp4link" type="video/mp4"></video></div><div></div></div><div></body></html></div>oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-58460722812162671972020-07-27T14:07:00.001+05:302020-07-27T14:07:25.392+05:30Ignoring whitespace diff in Git GUIEdit => Options => Additional Diff Parameters(Both in NRP repository
<br>and Global) = -w --ignore-all-spaceoneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-24593264310663618372020-04-23T16:21:00.003+05:302020-04-23T16:21:59.882+05:30Create a self signed cert, convert .pem to .cer<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Create private key:</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
openssl genrsa -out
privkey.pem 2048<br />
</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
See what's inside
the private key:<br />
openssl rsa -noout -text -in privkey.pem<br />
<br />
Create public key using the private key</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
openssl req -new
-x509 -key privkey.pem -out cacert.pem -days 3650<br />
<br />
See what's inside the public key:<br />
openssl x509 -in cacert.pem -text<br />
<br />
Convert public key from .pem to .cer</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
openssl x509 -inform
PEM -in cacert.pem -outform DER -out a.cer</div>
<br /></div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-15245232082717190532020-04-21T01:10:00.000+05:302020-04-26T01:45:57.344+05:30System design interviews<div dir="ltr" style="text-align: left;" trbidi="on">
Some common points:<br />
1. Logging - async logging - traceId which is passed to every layer to enable better debugging and log collation. Hot/cold storage for logs - ssd(recent logs) vs hdd(older logs)<br />
<br />
2. For media(videos) - how to handle long connections - LB will maintain state while backend undergoing change - for real time streaming, it's ok - just resume from current time. For static videos, resume from last known location. Use DSR, where LB is bypassed in the reverse direction and data directly goes to client.<br />
<br />
3. Analytics - separate storage for archival. EMR jobs. HDFS. Pig scripts.<br />
<br />
4. Caching - Distributed vs. each node having the full replica. In the second case, each one just listens for updates which flow from the master. How much RAM does each node have? Prevents a hop but not scalable. (Aerospike vs Redis distributed). Routing in the first case - client frequently syncs with the master for latest routing table updates so as to directly hit the backend vs. touching master for every call.<br />
<br />
5. Queuing - Kafka. Producers/consumers/intermediate storage - each layer should be able to independently scale. How to partition for listeners? How much data to store waiting for consumers to consume. Each consumer should be able to go back and forth in the queue irrespective of how much has been processed.<br />
<br />
6. LB - DNS(round robin/geo aware) -> L3(BGMP)->L4(TCP)->L7(http/redis). Passthru vs connection termination. DSR? SSL termination? Http2?<br />
<br />
7. Search - Lucene/Autocomplete based on tries/<br />
<br />
8. Social media content - Taiji - User clique based DC routing. A user is likely to consume content generated by friends. So cache data for a clique in the same DC.<br />
<br />
9. Separate out read/write flows - scale indep'ly. 100:1 read:write ratio.<br />
<br />
10. Redundancy - region/zone<br />
<br />
11. News feed - push/pull/hybrid<br />
<br />
12. DB partitioning - distribute tables? foreign keys? joins?<br />
<br />
13. Security,permissions,file sharing.<br />
<br />
14. Dropbox - files split in chunks - to better manage retries. Exponential backoff. Mobile clients do pull based sync to preserve data/battery. Response queue for each client. Request queue global. Inline Dedupe vs post processing dedupe.<br />
<br />
15. Yelp/Uber - quad tree<br />
<br />
16. Ticketmaster - transaction isolation levels<br />
<br />
17. </div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-14252313811408605582020-03-13T00:19:00.001+05:302020-03-13T00:19:33.040+05:30Multi tier load balancing with Linux<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="https://vincent.bernat.ch/en/blog/2018-multi-tier-loadbalancer" style="font-family: Calibri; font-size: 14pt;">Multi-tier
load-balancing with Linux</a><br />
<br />
<div style="border-width: 100%; direction: ltr;">
<div style="direction: ltr; margin-left: 0in; margin-top: 0in; width: 6.4708in;">
<div style="direction: ltr; margin-left: 0in; margin-top: 0in; width: 6.4708in;">
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<span style="font-weight: bold;">Summary in my words:</span><br />
</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Multiple L7 load
balancers talk to multiple backend servers. Backend servers don't have to worry
about path MTU discovery, TCP congestion control algorithms, avoidance of the
TIME-WAIT state and various other low-level details. But how do you load balance
the L7 LBs?</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
L3 load balancing
can help. All L7 LBs can advertise the same IP to their nearby routers. With a
combination of ECMP and BGP protocols, routers can load balance the packets to
L7 LBs. But adding/removing an L7 LB in this setup breaks existing connections(which
is problematic for long connections like downloads) since routers don't use
consistent hashing. Also, if one router becomes unavailable, other routers may
forward the packets to a different LB since every router decides
individually.<br />
<br />
To mitigate this, use L4 LBs between L3 and L7. The purpose of this tier is to
ensure that all members take same scheduling decision for an incoming packet
based on destination IP and port. Consistent hashing is used to route packets
to L7 LBs so removal/addition of an L7 LB isn't problematic. Additionally L7
LBs return response directly to L3 routers(DSR) to improve perf.<br />
<br />
So, routers use ECMP + BGP to route packets to L4 LBs(all of which advertise
the same IP). L4 LBs use consistent hashing scheme to talk to L7 LBs. And L7
LBs return the response directly to L3.</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<span style="font-weight: bold;">Actual Snippets from the article:</span></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
A combination
of:<br />
</div>
<ol style="direction: ltr; font-family: Calibri; font-size: 14.0pt; font-style: normal; font-weight: normal; margin-bottom: 0in; margin-left: .375in; margin-top: 0in; unicode-bidi: embed;" type="1">
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;" value="1"><span style="font-family: Calibri; font-family: Calibri; font-size: 14.0pt; font-size: 14.0pt; font-style: normal; font-weight: normal;">ECMP routing </span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 14.0pt;">Stateless L4 load-balancing </span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 14.0pt;">Stateful L7 load-balancing</span></li>
</ol>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<span style="font-weight: bold;">L7 (Last Tier):</span><span style="mso-spacerun: yes;">
</span><br />
<br />
Working in the highest layers of the OSI model, it can also offer additional
services, like TLS-termination, HTTP routing, header rewriting, rate-limiting
of unauthenticated users, and so on. Being stateful, it can leverage complex
load-balancing algorithm. Being the first point of contact with backend
servers, it should ease maintenances and minimize impact during daily changes. </div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
It also terminates
client TCP connections. This introduces some loose coupling between the
load-balancing components and the backend servers with the following benefits: </div>
<ol style="direction: ltr; font-family: Calibri; font-size: 14.0pt; font-style: normal; font-weight: normal; margin-bottom: 0in; margin-left: .375in; margin-top: 0in; unicode-bidi: embed;" type="1">
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;" value="1"><span style="font-family: Calibri; font-family: Calibri; font-size: 14.0pt; font-size: 14.0pt; font-style: normal; font-weight: normal;">connections to servers can be
kept open for lower resource use and latency, </span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 14.0pt;">requests can be retried
transparently in case of failure, </span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 14.0pt;">clients can use a different
IP protocol than servers, </span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 14.0pt;">and servers do not have to
care about path MTU discovery, TCP congestion control algorithms,
avoidance of the TIME-WAIT state and various other low-level details.</span></li>
</ol>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
As is, this solution
is not complete. We have just moved the availability and scalability problem
somewhere else. How do we load-balance the requests between the load-balancers?
<br />
<br />
<span style="font-weight: bold;">First tier: ECMP routing (L3)</span><br />
</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
On most modern
routed IP networks, redundant paths exist between clients and servers. For each
packet, routers have to choose a path. When the cost associated to each path is
equal, incoming flows are load-balanced among the available destinations. This characteristic
can be used to balance connections among available
load-balancers.<br />
<br />
If we assume you already have BGP-enabled routers available, ExaBGP is a
flexible solution to let the load-balancers advertise their availability.</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br />
Unfortunately, this solution is not resilient when an expected or unexpected
change happens. Notably, when adding or removing a load-balancer, the number of
available routes for a destination changes. The hashing algorithm used by
routers is not consistent and flows are reshuffled among the available
load-balancers, breaking existing connections.</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Moreover, each
router may choose its own routes. When a router becomes unavailable, the second
one may route the same flows differently which again breaks existing
connections.</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
If you think this is
not an acceptable outcome, notably if you need to handle long connections like
file downloads, video streaming or websocket connections, you need an
additional tier. Keep reading!</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<span style="font-weight: bold;">Second tier: L4 load-balancing</span></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
The second tier is
the glue between the stateless world of IP routers and the stateful land of L7
load-balancing. It is implemented with L4 load-balancing. The terminology can
be a bit confusing here: this tier routes IP datagrams (no TCP termination) but
the scheduler uses both destination IP and port to choose an available L7
load-balancer. The purpose of this tier is to ensure all members take the same
scheduling decision for an incoming packet. There are two options: </div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<ol style="direction: ltr; font-family: Calibri; font-size: 14.0pt; font-style: normal; font-weight: normal; margin-bottom: 0in; margin-left: .375in; margin-top: 0in; unicode-bidi: embed;" type="1">
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;" value="1"><span style="font-family: Calibri; font-family: Calibri; font-size: 14.0pt; font-size: 14.0pt; font-style: normal; font-weight: normal;">stateful L4 load-balancing
with state synchronization across the members, </span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 14.0pt;">or stateless L4
load-balancing with consistent hashing.</span></li>
</ol>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
The first option
increases complexity and limits scalability. We won’t use it. The second option
is less resilient during some changes but can be enhanced with a hybrid
approach using a local state.</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
We use IPVS, a
performant L4 load-balancer running inside the Linux kernel, with Keepalived, a
frontend to IPVS with a set of healthcheckers to kick out an unhealthy
component. IPVS is configured to use the Maglev scheduler, a consistent hashing
algorithm from Google. Among its family, this is a great algorithm because it
spreads connections fairly, minimizes disruptions during changes and is quite
fast at building its lookup table. Finally, to improve performance, we let the
last tier—the L7 load-balancers—sends back answers directly to the clients
without involving the second tier—the L4 load-balancers. This is referred to as
direct server return (DSR) or direct routing (DR).<br /> <span style="font-size: 14pt;"> </span></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
With such a setup, we expect packets from a flow to be able to move freely
between the components of the first two tiers while sticking to the same L7
load-balancer.</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<span style="font-weight: bold;">Tier 0: DNS load-balancing </span></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Optionally, you can
add DNS load-balancing to the mix. This is useful either if your setup is
spanned across multiple datacenters, or multiple cloud regions, or if you want
to break a large load-balancing cluster into smaller ones. It is not intended
to replace the first tier as it doesn’t share the same characteristics:
load-balancing is unfair (it is not flow-based) and recovery from a failure is
slow.</div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 14.0pt; margin: 0in;">
<br /></div>
</div>
</div>
</div>
<br /></div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-86994537854964078822020-03-09T02:20:00.066+05:302022-06-22T14:16:28.160+05:30Interview Questions<div dir="ltr" style="text-align: left;" trbidi="on">Counting sort</div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/campus-bikes/">https://leetcode.com/problems/campus-bikes/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/height-checker/">https://leetcode.com/problems/height-checker/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div><div dir="ltr" style="text-align: left;" trbidi="on">DSU<br /><a href="https://leetcode.com/problems/sentence-similarity-ii/">https://leetcode.com/problems/sentence-similarity-ii/</a> (DFS as well)</div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/number-of-islands-ii/">https://leetcode.com/problems/number-of-islands-ii/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/most-stones-removed-with-same-row-or-column/">https://leetcode.com/problems/most-stones-removed-with-same-row-or-column/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div><div dir="ltr" style="text-align: left;" trbidi="on">
Searching in Sorted 2-D array<br />
1. <a href="https://leetcode.com/problems/search-a-2d-matrix-ii/">https://leetcode.com/problems/search-a-2d-matrix-ii/</a><br />
2. <a href="https://leetcode.com/problems/search-a-2d-matrix/">https://leetcode.com/problems/search-a-2d-matrix/</a><br />
3. <a href="https://leetcode.com/problems/kth-smallest-element-in-a-sorted-matrix/">https://leetcode.com/problems/kth-smallest-element-in-a-sorted-matrix/</a><br />
<br />
DAG/Topological sort<br />
4. <a href="https://leetcode.com/problems/course-schedule/">https://leetcode.com/problems/course-schedule/</a><br />
5. <a href="https://leetcode.com/problems/course-schedule-ii">https://leetcode.com/problems/course-schedule-ii</a><br />
6. <a href="https://leetcode.com/problems/alien-dictionary/">https://leetcode.com/problems/alien-dictionary/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/longest-increasing-path-in-a-matrix/">https://leetcode.com/problems/longest-increasing-path-in-a-matrix/</a> (OR DFS with memo/DP)</div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/all-paths-from-source-lead-to-destination/">https://leetcode.com/problems/all-paths-from-source-lead-to-destination/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/network-delay-time/">https://leetcode.com/problems/network-delay-time/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">(diagram gives the impression of DAG, but it's DG => Directed and cyclic, find the shortest path from source to all)</div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/reorder-routes-to-make-all-paths-lead-to-the-city-zero/">https://leetcode.com/problems/reorder-routes-to-make-all-paths-lead-to-the-city-zero/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><br />
Trie or other string match<br />
7. <a href="https://leetcode.com/problems/add-and-search-word-data-structure-design/">https://leetcode.com/problems/add-and-search-word-data-structure-design/</a><br />
8. <a href="https://leetcode.com/problems/wildcard-matching/">https://leetcode.com/problems/wildcard-matching/</a> </div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/stream-of-characters/">https://leetcode.com/problems/stream-of-characters/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/design-search-autocomplete-system/">https://leetcode.com/problems/design-search-autocomplete-system/<br /></a>Similar:<br /><a href="https://leetcode.com/problems/word-ladder/">https://leetcode.com/problems/word-ladder/</a><br /><a href="https://leetcode.com/problems/implement-magic-dictionary/">https://leetcode.com/problems/implement-magic-dictionary/</a><br />
<br />
Heap<br />
9. <a href="https://leetcode.com/problems/top-k-frequent-elements/">https://leetcode.com/problems/top-k-frequent-elements/</a><br />
10. <a href="https://leetcode.com/problems/the-k-weakest-rows-in-a-matrix/">https://leetcode.com/problems/the-k-weakest-rows-in-a-matrix/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/mean-of-array-after-removing-some-elements/">https://leetcode.com/problems/mean-of-array-after-removing-some-elements/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><div dir="ltr" trbidi="on"><span style="color: #0000ee;"><u><a href="https://leetcode.com/problems/minimum-cost-to-hire-k-workers/">https://leetcode.com/problems/minimum-cost-to-hire-k-workers/</a></u></span></div><div><span style="color: #0000ee;"><br /></span></div></div><div dir="ltr" style="text-align: left;" trbidi="on">non heap approach as well</div><div dir="ltr" style="text-align: left;" trbidi="on">
<br />
Graph<br />11. DFS => <a href="https://leetcode.com/problems/accounts-merge/">https://leetcode.com/problems/accounts-merge/</a><br />12. 2D Matrix BFS => <a href="https://leetcode.com/problems/walls-and-gates/">https://leetcode.com/problems/walls-and-gates/</a><br />13. DFS => <a href="https://leetcode.com/problems/is-graph-bipartite">https://leetcode.com/problems/is-graph-bipartite</a><br />14. DFS => <a href="https://leetcode.com/problems/similar-string-groups/">https://leetcode.com/problems/similar-string-groups/</a><br />15. BFS => <a href="https://leetcode.com/problems/word-ladder/">https://leetcode.com/problems/word-ladder/</a><br />16. BFS => <a href="https://leetcode.com/problems/maximum-depth-of-n-ary-tree/">https://leetcode.com/problems/maximum-depth-of-n-ary-tree/</a><br />17. DFS => <a href="https://leetcode.com/problems/delete-nodes-and-return-forest/">https://leetcode.com/problems/delete-nodes-and-return-forest/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/android-unlock-patterns/">https://leetcode.com/problems/android-unlock-patterns/ </a> (And permutation generation)</div><div dir="ltr" style="text-align: left;" trbidi="on">BFS <a href="https://leetcode.com/problems/sliding-puzzle/">https://leetcode.com/problems/sliding-puzzle/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">BFS <a href="https://leetcode.com/problems/evaluate-division/">https://leetcode.com/problems/evaluate-division/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">Modified BFS <a href="https://leetcode.com/problems/shortest-path-in-a-grid-with-obstacles-elimination/">https://leetcode.com/problems/shortest-path-in-a-grid-with-obstacles-elimination/</a><br />DFS => <a href="https://leetcode.com/problems/path-with-maximum-gold/">https://leetcode.com/problems/path-with-maximum-gold/</a> (backtracking)</div><div dir="ltr" style="text-align: left;" trbidi="on">DFS => <a href="https://leetcode.com/problems/minimum-time-to-collect-all-apples-in-a-tree/">https://leetcode.com/problems/minimum-time-to-collect-all-apples-in-a-tree/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/find-root-of-n-ary-tree/">https://leetcode.com/problems/find-root-of-n-ary-tree/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">DFS => <a href="https://leetcode.com/problems/letter-tile-possibilities/">https://leetcode.com/problems/letter-tile-possibilities/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/flower-planting-with-no-adjacent/">
https://leetcode.com/problems/flower-planting-with-no-adjacent/</a><br /><br /></div><div dir="ltr" style="text-align: left;" trbidi="on"><br />
Matrix<br />
18. <a href="https://leetcode.com/problems/range-sum-query-2d-immutable/">https://leetcode.com/problems/range-sum-query-2d-immutable/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">build upon this <a href="https://leetcode.com/problems/matrix-block-sum/">https://leetcode.com/problems/matrix-block-sum/</a><br />
19. <a href="https://leetcode.com/problems/diagonal-traverse/">https://leetcode.com/problems/diagonal-traverse/</a><br />20. BFS => <a href="https://leetcode.com/problems/shortest-path-in-binary-matrix/">https://leetcode.com/problems/shortest-path-in-binary-matrix/</a><br />21. BFS => <a href="https://leetcode.com/problems/swim-in-rising-water">https://leetcode.com/problems/swim-in-rising-water</a><br />22. BFS => <a href="https://leetcode.com/problems/find-positive-integer-solution-for-a-given-equation/">https://leetcode.com/problems/find-positive-integer-solution-for-a-given-equation/</a><br />23. BFS => <a href="https://leetcode.com/problems/shortest-distance-from-all-buildings/">https://leetcode.com/problems/shortest-distance-from-all-buildings/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/find-valid-matrix-given-row-and-column-sums/">https://leetcode.com/problems/find-valid-matrix-given-row-and-column-sums/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/toeplitz-matrix/">https://leetcode.com/problems/toeplitz-matrix/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">
<br />
Linked Lists<br />
24. <a href="https://leetcode.com/problems/next-greater-node-in-linked-list/">https://leetcode.com/problems/next-greater-node-in-linked-list/</a><br />
25. <a href="https://leetcode.com/problems/merge-two-sorted-lists/">https://leetcode.com/problems/merge-two-sorted-lists/</a><br />
26. <a href="https://leetcode.com/problems/reorder-list/">https://leetcode.com/problems/reorder-list/</a><br />
27. <a href="https://leetcode.com/problems/insert-into-a-sorted-circular-linked-list/">https://leetcode.com/problems/insert-into-a-sorted-circular-linked-list/</a><br />
28. <a href="https://leetcode.com/problems/convert-binary-search-tree-to-sorted-doubly-linked-list/">https://leetcode.com/problems/convert-binary-search-tree-to-sorted-doubly-linked-list/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/print-immutable-linked-list-in-reverse/">https://leetcode.com/problems/print-immutable-linked-list-in-reverse/</a><br />
<br />
Stack<br />
29. <a href="https://leetcode.com/problems/simplify-path/">https://leetcode.com/problems/simplify-path/</a><br />30. Parentheses => <a href="https://leetcode.com/problems/valid-parentheses/">https://leetcode.com/problems/valid-parentheses/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"> <a href="https://leetcode.com/problems/brace-expansion/">https://leetcode.com/problems/brace-expansion/</a><br /><a href="https://leetcode.com/problems/remove-all-adjacent-duplicates-in-string/">
https://leetcode.com/problems/remove-all-adjacent-duplicates-in-string/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/remove-outermost-parentheses/">https://leetcode.com/problems/remove-outermost-parentheses/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><br />
Interval merging/intersection/Task Scheduling<br />
31. <a href="https://leetcode.com/problems/task-scheduler/">https://leetcode.com/problems/task-scheduler/</a><br />
<br />
32. <a href="https://leetcode.com/problems/non-overlapping-intervals/">https://leetcode.com/problems/non-overlapping-intervals/</a> - sort by end time ascending, maximize number of tasks finished<br />
33. <a href="https://leetcode.com/problems/minimum-number-of-arrows-to-burst-balloons">https://leetcode.com/problems/minimum-number-of-arrows-to-burst-balloons</a> - same as above, sort by end time<br />
<br />
34. <a href="https://leetcode.com/problems/interval-list-intersections/">https://leetcode.com/problems/interval-list-intersections/</a><br />
35. <a href="https://leetcode.com/problems/merge-intervals/">https://leetcode.com/problems/merge-intervals/</a> - start time ascending sort, keep merging with top of the stack<br />
36. <a href="https://leetcode.com/problems/meeting-rooms/">https://leetcode.com/problems/meeting-rooms/</a><br />
<br />
These 4 problems have exact same solution:<br />37. <a href="https://leetcode.com/problems/meeting-rooms-ii/">https://leetcode.com/problems/meeting-rooms-ii/</a><br />38. <a href="https://leetcode.com/problems/my-calendar-i/">https://leetcode.com/problems/my-calendar-i/</a><br />
39. <a href="https://leetcode.com/problems/my-calendar-ii/">https://leetcode.com/problems/my-calendar-ii/</a><br />
40. <a href="https://leetcode.com/problems/my-calendar-iii/">https://leetcode.com/problems/my-calendar-iii/</a><br />
<br />
41. <a href="https://leetcode.com/problems/data-stream-as-disjoint-intervals/">https://leetcode.com/problems/data-stream-as-disjoint-intervals/</a><br />
<br />
Binary Tree/Binary Search Tree<br /><a href="https://leetcode.com/problems/binary-tree-coloring-game/">
https://leetcode.com/problems/binary-tree-coloring-game/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/all-possible-full-binary-trees/">https://leetcode.com/problems/all-possible-full-binary-trees/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/peak-index-in-a-mountain-array/">https://leetcode.com/problems/peak-index-in-a-mountain-array/<br /></a><a href="https://leetcode.com/problems/find-in-mountain-array/">https://leetcode.com/problems/find-in-mountain-array/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/correct-a-binary-tree/">https://leetcode.com/problems/correct-a-binary-tree/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div><div dir="ltr" style="text-align: left;" trbidi="on">42. Inorder Traversal => <a href="https://leetcode.com/problems/binary-search-tree-iterator/">https://leetcode.com/problems/binary-search-tree-iterator/</a><br />43. Inorder Traversal => <a href="https://leetcode.com/problems/validate-binary-search-tree/">https://leetcode.com/problems/validate-binary-search-tree/</a><br />44. InOrder => <a href="https://leetcode.com/problems/kth-smallest-element-in-a-bst">https://leetcode.com/problems/kth-smallest-element-in-a-bst</a><br />45. InOrder => <a href="https://leetcode.com/problems/increasing-order-search-tree">https://leetcode.com/problems/increasing-order-search-tree</a><br />
<div>46. PostOrder => <a href="https://leetcode.com/problems/serialize-and-deserialize-binary-tree/">https://leetcode.com/problems/serialize-and-deserialize-binary-tree/</a><br />47. PostOrder => <a href="https://leetcode.com/problems/range-sum-of-bst/">https://leetcode.com/problems/range-sum-of-bst/</a><br />48. PostOrder => <a href="https://leetcode.com/problems/flatten-binary-tree-to-linked-list">https://leetcode.com/problems/flatten-binary-tree-to-linked-list</a><br />49. DFS (backtracking) <a href="https://leetcode.com/problems/lowest-common-ancestor-of-a-binary-tree/">https://leetcode.com/problems/lowest-common-ancestor-of-a-binary-tree/</a><br />50. DFS => <a href="https://leetcode.com/problems/lowest-common-ancestor-of-deepest-leaves">https://leetcode.com/problems/lowest-common-ancestor-of-deepest-leaves</a><br />51. DFS <a href="https://leetcode.com/problems/all-nodes-distance-k-in-binary-tree/">https://leetcode.com/problems/all-nodes-distance-k-in-binary-tree/</a><br />52. BFS => <a href="https://leetcode.com/problems/check-completeness-of-a-binary-tree/">https://leetcode.com/problems/check-completeness-of-a-binary-tree/</a><br />53. BFS => <a href="https://leetcode.com/problems/binary-tree-right-side-view/">https://leetcode.com/problems/binary-tree-right-side-view/</a><br />
54. <a href="https://leetcode.com/problems/binary-tree-vertical-order-traversal/">https://leetcode.com/problems/binary-tree-vertical-order-traversal/</a><br />
55. <a href="https://leetcode.com/problems/closest-binary-search-tree-value/">https://leetcode.com/problems/closest-binary-search-tree-value/</a><br />
56. <a href="https://leetcode.com/problems/smallest-subtree-with-all-the-deepest-nodes/">https://leetcode.com/problems/smallest-subtree-with-all-the-deepest-nodes/</a><br />
57. <a href="https://leetcode.com/problems/minimum-depth-of-binary-tree/">https://leetcode.com/problems/minimum-depth-of-binary-tree/</a><br />
58. <a href="https://leetcode.com/problems/binary-tree-longest-consecutive-sequence/">https://leetcode.com/problems/binary-tree-longest-consecutive-sequence/</a><br />
59. <a href="https://leetcode.com/problems/binary-tree-longest-consecutive-sequence-ii/">https://leetcode.com/problems/binary-tree-longest-consecutive-sequence-ii/</a><br />
60. <a href="https://leetcode.com/problems/split-bst/">https://leetcode.com/problems/split-bst/</a><br />
61. <a href="https://leetcode.com/problems/populating-next-right-pointers-in-each-node">https://leetcode.com/problems/populating-next-right-pointers-in-each-node</a><br />
62. <a href="https://leetcode.com/problems/univalued-binary-tree">https://leetcode.com/problems/univalued-binary-tree</a><br />
63. <a href="https://leetcode.com/problems/diameter-of-binary-tree">https://leetcode.com/problems/diameter-of-binary-tree</a></div>
<br />
Binary Search<br />
64. <a href="https://leetcode.com/problems/missing-element-in-sorted-array/">https://leetcode.com/problems/missing-element-in-sorted-array/</a><br />
65. <a href="https://leetcode.com/problems/random-pick-with-weight/">https://leetcode.com/problems/random-pick-with-weight/</a><br />
66. <a href="https://leetcode.com/problems/find-first-and-last-position-of-element-in-sorted-array/">https://leetcode.com/problems/find-first-and-last-position-of-element-in-sorted-array/</a><br />
67. <a href="https://leetcode.com/problems/first-bad-version/">https://leetcode.com/problems/first-bad-version/</a><br />
68. <a href="https://leetcode.com/problems/fixed-point/">https://leetcode.com/problems/fixed-point/</a><br />
69. <a href="https://leetcode.com/problems/count-negative-numbers-in-a-sorted-matrix">https://leetcode.com/problems/count-negative-numbers-in-a-sorted-matrix</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/find-peak-element/">https://leetcode.com/problems/find-peak-element/</a><br />
<br />
<br />
Continuous SubArray/SubString/SubSequence<br />
70. <a href="https://leetcode.com/problems/subarray-sum-equals-k/">https://leetcode.com/problems/subarray-sum-equals-k/</a><br />
71. <a href="https://leetcode.com/problems/continuous-subarray-sum/">https://leetcode.com/problems/continuous-subarray-sum/</a><br />
72. <a href="https://leetcode.com/problems/maximum-size-subarray-sum-equals-k/submissions/">https://leetcode.com/problems/maximum-size-subarray-sum-equals-k/</a><br />
73. <a href="https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/">https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">similar to <a href="https://leetcode.com/problems/fruit-into-baskets/">https://leetcode.com/problems/fruit-into-baskets/</a><br />
74. <a href="https://leetcode.com/problems/find-all-anagrams-in-a-string/solution/">https://leetcode.com/problems/find-all-anagrams-in-a-string/solution/</a><br />
75. <a href="https://leetcode.com/problems/permutation-in-string/">https://leetcode.com/problems/permutation-in-string/</a><br />
76. <a href="https://leetcode.com/problems/max-consecutive-ones-iii/">https://leetcode.com/problems/max-consecutive-ones-iii/</a><br />
77. <a href="https://leetcode.com/problems/longest-consecutive-sequence">https://leetcode.com/problems/longest-consecutive-sequence</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/sliding-window-maximum/">https://leetcode.com/problems/sliding-window-maximum/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/longest-continuous-subarray-with-absolute-diff-less-than-or-equal-to-limit/">https://leetcode.com/problems/longest-continuous-subarray-with-absolute-diff-less-than-or-equal-to-limit/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/shortest-way-to-form-string/</u></span><a href="https://leetcode.com/problems/longest-continuous-subarray-with-absolute-diff-less-than-or-equal-to-limit/"><br /></a>
<br />
DP<br />
78. <a href="https://leetcode.com/problems/partition-equal-subset-sum">https://leetcode.com/problems/partition-equal-subset-sum</a><br />
79. <a href="https://leetcode.com/problems/minimum-cost-for-tickets/">https://leetcode.com/problems/minimum-cost-for-tickets/</a><br />
80. <a href="https://leetcode.com/problems/longest-arithmetic-sequence/">https://leetcode.com/problems/longest-arithmetic-sequence/</a><br />
81. <a href="https://leetcode.com/problems/decode-ways/">https://leetcode.com/problems/decode-ways/</a><br />
82. <a href="https://leetcode.com/problems/minimum-knight-moves/">https://leetcode.com/problems/minimum-knight-moves/</a><br />
83. <a href="https://leetcode.com/problems/target-sum/">https://leetcode.com/problems/target-sum/</a><br />
84. <a href="https://leetcode.com/problems/maximal-square">https://leetcode.com/problems/maximal-square</a><br />
85. <a href="https://leetcode.com/problems/word-break/">https://leetcode.com/problems/word-break/</a><br />
86. <a href="https://leetcode.com/problems/longest-line-of-consecutive-one-in-matrix">https://leetcode.com/problems/longest-line-of-consecutive-one-in-matrix</a><br />
87. <a href="https://leetcode.com/problems/knight-probability-in-chessboard/">https://leetcode.com/problems/knight-probability-in-chessboard/</a><br />
88. <a href="https://leetcode.com/problems/valid-palindrome-iii/">https://leetcode.com/problems/valid-palindrome-iii/</a><br />
89. <a href="https://leetcode.com/problems/longest-palindromic-subsequence">https://leetcode.com/problems/longest-palindromic-subsequence</a><br />
90. <a href="https://leetcode.com/problems/number-of-ways-to-stay-in-the-same-place-after-some-steps/">https://leetcode.com/problems/number-of-ways-to-stay-in-the-same-place-after-some-steps/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">91. <a href="https://leetcode.com/problems/how-many-apples-can-you-put-into-the-basket">https://leetcode.com/problems/how-many-apples-can-you-put-into-the-basket</a><br />
92. <a href="https://leetcode.com/problems/best-time-to-buy-and-sell-stock-iii/">https://leetcode.com/problems/best-time-to-buy-and-sell-stock-iii/</a> (Check i, ii as well)<br />
93. <a href="https://leetcode.com/problems/best-time-to-buy-and-sell-stock-iv">https://leetcode.com/problems/best-time-to-buy-and-sell-stock-iv</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/number-of-dice-rolls-with-target-sum/">https://leetcode.com/problems/number-of-dice-rolls-with-target-sum/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/perfect-squares/</u></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/campus-bikes-ii/</u></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/cherry-pickup-ii/</u></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/minimum-distance-to-type-a-word-using-two-fingers/</u></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/minimum-swaps-to-make-sequences-increasing/</u></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/odd-even-jump/</u></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><u>https://leetcode.com/problems/count-square-submatrices-with-all-ones/</u></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #0000ee;"><span style="color: black;">contrast with <a href="https://leetcode.com/problems/count-submatrices-with-all-ones/">https://leetcode.com/problems/count-submatrices-with-all-ones/</a></span></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div><div dir="ltr" style="text-align: left;" trbidi="on">
<br />
Array/List manipulation<br />
94. <a href="https://leetcode.com/problems/move-zeroes/">https://leetcode.com/problems/move-zeroes/</a><br />
95. <a href="https://leetcode.com/problems/nested-list-weight-sum-ii/">https://leetcode.com/problems/nested-list-weight-sum-ii/</a> - also do (i)<br />
96. <a href="https://leetcode.com/problems/squares-of-a-sorted-array/">https://leetcode.com/problems/squares-of-a-sorted-array/</a><br />
97. <a href="https://leetcode.com/problems/product-of-array-except-self/">https://leetcode.com/problems/product-of-array-except-self/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">similar to <a href="https://leetcode.com/problems/minimum-number-of-operations-to-move-all-balls-to-each-box/">https://leetcode.com/problems/minimum-number-of-operations-to-move-all-balls-to-each-box/</a><br />
98. <a href="https://leetcode.com/problems/maximum-swap/">https://leetcode.com/problems/maximum-swap/</a><br />
99. <a href="https://leetcode.com/problems/intersection-of-three-sorted-arrays/">https://leetcode.com/problems/intersection-of-three-sorted-arrays/</a><br />
100. <a href="https://leetcode.com/problems/remove-duplicates-from-sorted-array">https://leetcode.com/problems/remove-duplicates-from-sorted-array</a><br />
101. <a href="https://leetcode.com/problems/single-row-keyboard/">https://leetcode.com/problems/single-row-keyboard/</a><br />
102. <a href="https://leetcode.com/problems/decompress-run-length-encoded-list/">https://leetcode.com/problems/decompress-run-length-encoded-list/</a><br />
103. <a href="https://leetcode.com/problems/sort-array-by-parity-ii/">https://leetcode.com/problems/sort-array-by-parity-ii/</a><br />
104. <a href="https://leetcode.com/problems/monotonic-array/">https://leetcode.com/problems/monotonic-array/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/wiggle-sort/">https://leetcode.com/problems/wiggle-sort/</a><br />
<br />
Deep Copy<br />
105. <a href="https://leetcode.com/problems/clone-graph/">https://leetcode.com/problems/clone-graph/</a><br />
106. <a href="https://leetcode.com/problems/copy-list-with-random-pointer/">https://leetcode.com/problems/copy-list-with-random-pointer/</a><br />
<br />
Arithmetic operations/Math<br />
107. <a href="https://leetcode.com/problems/multiply-strings/">https://leetcode.com/problems/multiply-strings/</a><br />
108. <a href="https://leetcode.com/problems/divide-two-integers/">https://leetcode.com/problems/divide-two-integers/</a> (Read the Solution section, quite good)<br />
109. <a href="https://leetcode.com/problems/bulb-switcher/">https://leetcode.com/problems/bulb-switcher/</a><br />
110. <a href="https://leetcode.com/problems/fraction-to-recurring-decimal/">https://leetcode.com/problems/fraction-to-recurring-decimal/</a><br />
111. <a href="https://leetcode.com/problems/basic-calculator-ii/">https://leetcode.com/problems/basic-calculator-ii/</a><br />112. Geometry => <a href="https://leetcode.com/problems/minimum-area-rectangle">https://leetcode.com/problems/minimum-area-rectangle</a><br />
113. <a href="https://leetcode.com/problems/largest-number/">https://leetcode.com/problems/largest-number/</a><br />
114. <a href="https://leetcode.com/problems/projection-area-of-3d-shapes/">https://leetcode.com/problems/projection-area-of-3d-shapes/</a><br />
115. <a href="https://leetcode.com/problems/number-of-steps-to-reduce-a-number-to-zero">https://leetcode.com/problems/number-of-steps-to-reduce-a-number-to-zero</a><br />
116. <a href="https://leetcode.com/problems/plus-one/">https://leetcode.com/problems/plus-one/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/add-to-array-form-of-integer/">https://leetcode.com/problems/add-to-array-form-of-integer/<br /></a>
<br />
Generating permutations/combinations<br />
117. <a href="https://leetcode.com/problems/next-permutation/">https://leetcode.com/problems/next-permutation/</a><br />
118. <a href="https://leetcode.com/problems/letter-combinations-of-a-phone-number/submissions/">https://leetcode.com/problems/letter-combinations-of-a-phone-number/</a> (cross product of lists)<br />
119. <a href="https://leetcode.com/problems/subsets">https://leetcode.com/problems/subsets</a> (power set)</div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/di-string-match/">https://leetcode.com/problems/di-string-match/</a><br />
<br />
String manipulation</div><div dir="ltr" style="text-align: left;" trbidi="on">120. <a href="https://leetcode.com/problems/camelcase-matching/">https://leetcode.com/problems/camelcase-matching/<br /></a>121. <a href="https://leetcode.com/problems/goat-latin/">https://leetcode.com/problems/goat-latin/</a><br />
122. <a href="https://leetcode.com/problems/custom-sort-string/">https://leetcode.com/problems/custom-sort-string/</a><br />
123. <a href="https://leetcode.com/problems/add-bold-tag-in-string/">https://leetcode.com/problems/add-bold-tag-in-string/</a><br />124. Parentheses => <a href="https://leetcode.com/problems/minimum-add-to-make-parentheses-valid/">https://leetcode.com/problems/minimum-add-to-make-parentheses-valid/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">125. <a href="https://leetcode.com/problems/longest-chunked-palindrome-decomposition/">https://leetcode.com/problems/longest-chunked-palindrome-decomposition/</a><br />126. Parentheses => <a href="https://leetcode.com/problems/minimum-remove-to-make-valid-parentheses/">https://leetcode.com/problems/minimum-remove-to-make-valid-parentheses/</a><br />
127. <a href="https://leetcode.com/problems/remove-invalid-parentheses/">https://leetcode.com/problems/remove-invalid-parentheses/</a><br />128. Parentheses => <a href="https://leetcode.com/problems/generate-parentheses/">https://leetcode.com/problems/generate-parentheses/</a><br />
129. <a href="https://leetcode.com/problems/reorganize-string/">https://leetcode.com/problems/reorganize-string/</a><br />130. <a href="https://leetcode.com/problems/find-and-replace-in-string/">https://leetcode.com/problems/find-and-replace-in-string/</a><br />
131. <a href="https://leetcode.com/problems/implement-magic-dictionary/">https://leetcode.com/problems/implement-magic-dictionary/</a><br />
132. <a href="https://leetcode.com/problems/group-shifted-strings/">https://leetcode.com/problems/group-shifted-strings/</a><br />
133. <a href="https://leetcode.com/problems/subdomain-visit-count/">https://leetcode.com/problems/subdomain-visit-count/</a><br />
134. <a href="https://leetcode.com/problems/jewels-and-stones/">https://leetcode.com/problems/jewels-and-stones/</a><br />
135. <a href="https://leetcode.com/problems/is-subsequence/">https://leetcode.com/problems/is-subsequence/</a><br />
136. <a href="https://leetcode.com/problems/string-transforms-into-another-string/">https://leetcode.com/problems/string-transforms-into-another-string/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">similar to <a href="https://leetcode.com/problems/isomorphic-strings/">https://leetcode.com/problems/isomorphic-strings/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/decode-string/">https://leetcode.com/problems/decode-string/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/x-of-a-kind-in-a-deck-of-cards/">https://leetcode.com/problems/x-of-a-kind-in-a-deck-of-cards/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/long-pressed-name/">https://leetcode.com/problems/long-pressed-name/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/repeated-substring-pattern/">https://leetcode.com/problems/repeated-substring-pattern/</a><br />
<br />
Others<br />
137. <a href="https://leetcode.com/problems/exam-room/">https://leetcode.com/problems/exam-room/</a></div><div dir="ltr" style="text-align: left;" trbidi="on">start from <a href="https://leetcode.com/problems/maximize-distance-to-closest-person/">https://leetcode.com/problems/maximize-distance-to-closest-person/</a><br />
138. <a href="https://leetcode.com/problems/next-closest-time">https://leetcode.com/problems/next-closest-time</a><br />
139. <a href="https://leetcode.com/problems/validate-ip-address/">https://leetcode.com/problems/validate-ip-address/</a><br />
140. <a href="https://leetcode.com/problems/unique-email-addresses/">https://leetcode.com/problems/unique-email-addresses/</a><br />
141. <a href="https://leetcode.com/problems/flatten-nested-list-iterator/">https://leetcode.com/problems/flatten-nested-list-iterator/</a><br />
142. <a href="https://leetcode.com/problems/logger-rate-limiter/">https://leetcode.com/problems/logger-rate-limiter/</a><br />
143. <a href="https://leetcode.com/problems/random-pick-index/">https://leetcode.com/problems/random-pick-index/</a><br />
144. <a href="https://leetcode.com/problems/find-the-celebrity">https://leetcode.com/problems/find-the-celebrity</a></div><div dir="ltr" style="text-align: left;" trbidi="on">145. <a href="https://leetcode.com/problems/3sum/">https://leetcode.com/problems/3sum/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/sentence-screen-fitting/">https://leetcode.com/problems/sentence-screen-fitting/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/alphabet-board-path/">https://leetcode.com/problems/alphabet-board-path/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/shortest-distance-to-target-color/">https://leetcode.com/problems/shortest-distance-to-target-color/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/positions-of-large-groups/">https://leetcode.com/problems/positions-of-large-groups/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/robot-return-to-origin/">https://leetcode.com/problems/robot-return-to-origin/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/expressive-words/">https://leetcode.com/problems/expressive-words/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/rearrange-spaces-between-words/">https://leetcode.com/problems/rearrange-spaces-between-words/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/confusing-number-ii/">https://leetcode.com/problems/confusing-number-ii/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/bulls-and-cows/">https://leetcode.com/problems/bulls-and-cows/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/design-phone-directory/">https://leetcode.com/problems/design-phone-directory/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/snapshot-array/">https://leetcode.com/problems/snapshot-array/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/longest-absolute-file-path/">https://leetcode.com/problems/longest-absolute-file-path/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/license-key-formatting/">https://leetcode.com/problems/license-key-formatting/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><a href="https://leetcode.com/problems/k-empty-slots/">https://leetcode.com/problems/k-empty-slots/</a></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div><div dir="ltr" style="text-align: left;" trbidi="on"><br /></div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-27123001727779733012020-03-05T01:49:00.001+05:302020-03-05T01:49:56.459+05:30permutation combination generation<div dir="ltr" style="text-align: left;" trbidi="on">
n choose k from array<br />
<a href="https://stackoverflow.com/a/16256122/49560">https://stackoverflow.com/a/16256122/49560</a><br />
<br />
all permutations of a string<br />
<a href="https://stackoverflow.com/a/4240323/49560">https://stackoverflow.com/a/4240323/49560</a><br />
<br />
all permutations of an array<br />
<a href="https://stackoverflow.com/a/14444037/49560">https://stackoverflow.com/a/14444037/49560</a></div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-24407918671385702102019-10-28T20:09:00.000+05:302019-10-28T21:47:29.824+05:30mathematical mindset notes<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
Books by Keith Devlin<br />
A mathematician grappling with his century<br />
Flannery 2002 - how the puzzles with her dad helped her math<br />
<br />
<br />
KENKEN puzzles<br />
Youcubed.org -> number sense/ Fluency without fear/<br />
Mark Driscoll's books, Ruth Parker's mathematics problems, 2 curricula from England 1) Points of Departure 2) SMILE (Secondary mathematics individualized learning experience)<br />
Multilink cubes<br />
Given 36 sides, find shape with largest enclosure area, fencing, how does sin(x) help here. sin(10), circle.<br />
Book ref - What's math got to do with it?<br />
<br />
Don't seem useful: Apps and Games(All are paid seems like) - Motion Math, Number Rack(Math Learning Center), mathbreakers.com, Wuzzit Trouble,<br />
<br /></div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-71721834450275243792019-10-17T23:52:00.001+05:302019-10-17T23:52:42.586+05:30biggest streak of good numbers<div dir="ltr"><div class="gmail_quote">Given a list of bad numbers and lo and hi range, find the biggest streak of good numbers.</div><div class="gmail_quote"><br><div dir="ltr">class Solution {<br> public int driver() {<br> int[] bad = new int[]{1,15,27};<br> int lo = 27;<br> int hi = 30;<br> return bigStreak(bad, lo, hi);<br> }<br> <br> private int bigStreak(int[] bad, int lo, int hi) {<br> int max = 0;<br> int i = 0;<br> for(; i<bad.length-1; i++) {<br> int curr = 0; <br> if (hi < bad[i]) {<br> curr = hi - lo + 1;<br> if(curr > max) max = curr;<br> break;<br> } else if (hi == bad[i]) {<br> curr = hi - lo;<br> if (curr > max) max = curr;<br> break;<br> } else if (hi > bad[i] && hi <= bad[i + 1]) {<br> if (lo < bad[i]) {<br> int s1 = bad[i] - 1 - lo + 1;<br> int s2 = hi - (bad[i] + 1) + 1;<br> if (hi == bad[i + 1]) --s2;<br> curr = Math.max(s1, s2);<br> } else if (lo == bad[i]) {<br> curr = hi - (bad[i] + 1) + 1;<br> if (hi == bad[i + 1]) --curr;<br> } else if (lo > bad[i]) {<br> curr = hi - lo + 1;<br> if (hi == bad[i + 1]) --curr;<br> }<br> if (curr > max) max = curr;<br> break;<br> } else if (hi > bad[i + 1]){<br> if (lo <= bad[i]) {<br> int s1 = bad[i] - lo;<br> int s2 = bad[i + 1] - bad[i] - 1;<br> curr = Math.max(s1, s2);<br> lo = bad[i + 1] + 1;<br> } else if (lo > bad[i] && lo <= bad[i + 1]) {<br> curr = bad[i + 1] - lo;<br> lo = bad[i + 1] + 1;<br> } else if ( lo > bad[i + 1]) {<br> ;<br> }<br> if (curr > max) max = curr;<br> }<br> }<br> <br> if ( i == bad.length - 1) {<br> int curr = hi - lo + 1;<br> if ( curr > max ) max = curr;<br> }<br> return max;<br> }<br>}<br></div> </div></div> oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-14559926216272575722019-10-15T01:34:00.000+05:302019-10-15T02:34:28.512+05:30hotstar peak handling how<div dir="ltr" style="text-align: left;" trbidi="on">
1. WC 2019 IND vs NZ Semifinal resulted in 25.3M concurrent users. Usage of around 10Tbps bandwidth which is 70% of India's capacity.<br />
<div>
2. Scale proactively - can't scale in real time. Autoscaling results in insufficient capacity errors at times.</div>
<div>
3. Graceful degradation - turn off less critical services when there is too much traffic. Reduce video quality for users. If auth system is having issues(DB in the backend is slow) - allow users to watch without login.</div>
<div>
4. Peak traffic shift - as soon as Dhoni got out people went to the home screen. Home screen calls a lot of APIs - recommendation/watch history/general content.</div>
<div>
5. Push notifications result in a traffic peaks.</div>
<div>
6. Prewarm infra before match time.<br />
7. No dumb clients - clients should add jitter to requests so as not to overwhelm the backend in case of issues.<br />
8. Reject early - drive away a lot of spurious traffic - don't let it come to your backend.<br />
<br /></div>
<div>
<br /></div>
<div>
<a href="https://www.youtube.com/watch?v=kHRFQ3oebjU&feature=youtu.be">Video</a><br />
<br />
<a href="https://blog.hotstar.com/scaling-is-not-an-accident-895140ac84c0" target="_blank">Blog</a><br />
<br />
<a href="https://engineering.fb.com/ios/under-the-hood-broadcasting-live-video-to-millions/" target="_blank">Facebook live streaming</a> - split input feed into smaller chunks and deliver via hierarchy of cache servers. Don't let too many requests come to the origin server. First request will be a cache miss and rest will be held until the first one comes back.</div>
<div>
<br /></div>
</div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-41318237960268036982019-10-14T02:47:00.001+05:302019-10-14T02:49:39.662+05:30reservoir sampling<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="https://www.quora.com/What-is-an-intuitive-explanation-of-reservoir-sampling">good answer by sumitra narayan</a><br />
<div>
<br /></div>
<div>
<a href="https://en.wikipedia.org/wiki/Reservoir_sampling">Algorithm R from wikipedia</a>:<br />
<br />
(* S has items to sample, R will contain the result *)<br />
ReservoirSample(S[1..n], R[1..k])<br />
// fill the reservoir array<br />
for i = 1 to k<br />
R[i] := S[i]<br />
// replace elements with gradually decreasing probability<br />
for i = k+1 to n<br />
j := random(1, i) // important: inclusive range<br />
if j <= k R[j] := S[i]</div>
</div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com1tag:blogger.com,1999:blog-5434031919110451140.post-5285184096385806882019-10-14T02:18:00.001+05:302019-10-14T02:18:26.055+05:30String search algorithms - Naive vs Rabin Karp vs KMP<div dir="ltr">Naive algorithm is O(n) in average as the probability of a random match is very low and even low for 2 consecutive character matches : 26^2. For 1 million char text and 1000 char word 1.04 million comparisons in average case. Worst case O(m*n) so if the above example 1 billion comparisons.<div><br><div>Rabin karp - <a href="https://osf.io/yxjnp/download">used in plagiarism detection</a>. Uses rolling hash - for e.g. if the text is ABCD, and we want to search a word of size 3 then we first compare the hash of ABC which is a*7^0 + b*7^1 + c*7^2. If it doesn't match the hash of the word then we move forward like this 1. first drop A => prev_hash - a*7^0. 2. Then divide the balance by 7 => (prev_hash - a*7^0)/7. 3. Then add D => (prev_hash - a*7^0)/7 + D*7^2. Worst case O(m*n)</div></div><div><br></div><div>KMP - Knuth Morris Pratt - uses a pre-computed table to decide from where to resume search in case of failure. Worst case O(n).</div></div> oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com1tag:blogger.com,1999:blog-5434031919110451140.post-63510728091611652432019-09-28T02:27:00.001+05:302019-09-28T02:27:32.783+05:30Binary search simplified<div dir="ltr"><div style="color:rgb(0,0,0);background-color:rgb(255,255,254);font-family:Consolas,"Courier New",monospace;font-size:14px;line-height:19px;white-space:pre"> <div style="line-height:19px"><div><span style="color:rgb(0,0,255)">class</span> Main {</div><div> <span style="color:rgb(0,0,255)">public</span> <span style="color:rgb(0,0,255)">static</span> <span style="color:rgb(0,0,255)">void</span> main(String[] args) {</div><div> <span style="color:rgb(170,170,170)">//refer <a href="https://medium.com/@elizarov/programming-binary-search-6e999783ba5d">https://medium.com/@elizarov/programming-binary-search-6e999783ba5d</a></span></div><div> <span style="color:rgb(170,170,170)">//refer <a href="https://www.geeksforgeeks.org/variants-of-binary-search/">https://www.geeksforgeeks.org/variants-of-binary-search/</a></span></div><div> <span style="color:rgb(170,170,170)">//refer <a href="https://repl.it/@dharmendratolan/GoodnaturedPertinentIntegrationtesting">https://repl.it/@dharmendratolan/GoodnaturedPertinentIntegrationtesting</a></span></div><div><span style="color:rgb(170,170,170)"><br></span></div><div> <span style="color:rgb(170,170,170)">// java signed and unsigned right shift</span></div><div> <span style="color:rgb(0,0,255)">int</span> test = Integer.MAX_VALUE;</div><div> System.out.println(test);</div><div> <span style="color:rgb(0,0,255)">int</span> t1 = test + <span style="color:rgb(9,136,90)">1</span>;</div><div> System.out.println(t1);</div><div> <span style="color:rgb(0,0,255)">int</span> t2 = (test + test) >>> <span style="color:rgb(9,136,90)">1</span>;</div><div> System.out.println(t2);</div><div> <span style="color:rgb(0,0,255)">int</span> t3 = (test + test) >> <span style="color:rgb(9,136,90)">1</span>;</div><div> System.out.println(t3);</div><br><div> <span style="color:rgb(0,0,255)">int</span>[] a = {<span style="color:rgb(9,136,90)">3</span>,<span style="color:rgb(9,136,90)">3</span>,<span style="color:rgb(9,136,90)">3</span>,<span style="color:rgb(9,136,90)">20</span>,<span style="color:rgb(9,136,90)">20</span>,<span style="color:rgb(9,136,90)">20</span>};</div><div> <span style="color:rgb(0,0,255)">int</span> k = <span style="color:rgb(9,136,90)">3</span>;</div><br><div> k = <span style="color:rgb(9,136,90)">3</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"any("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + any(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">4</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"any("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + any(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">20</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"any("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + any(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">3</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"firstGte("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + firstGte(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">20</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"firstGte("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + firstGte(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">19</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"firstGte("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + firstGte(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">21</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"firstGte("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + firstGte(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">3</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"firstGt("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + firstGt(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">20</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"firstGt("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + firstGt(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">19</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"firstGt("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + firstGt(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">3</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"lastLte("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + lastLte(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">20</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"lastLte("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + lastLte(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">19</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"lastLte("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + lastLte(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">3</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"lastLt("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + lastLt(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">20</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"lastLt("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + lastLt(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">19</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"lastLt("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + lastLt(a,k));</div><br><div> k = <span style="color:rgb(9,136,90)">21</span>;</div><div> System.out.println(<span style="color:rgb(163,21,21)">"lastLt("</span> + k + <span style="color:rgb(163,21,21)">")"</span> + lastLt(a,k));</div><div> }</div><br><div> <span style="color:rgb(0,0,255)">private</span> <span style="color:rgb(0,0,255)">static</span> Integer lastLt(<span style="color:rgb(0,0,255)">int</span>[] a, <span style="color:rgb(0,0,255)">int</span> k) {</div><div> <span style="color:rgb(170,170,170)">// last LT means the element just at the left of k(or virtual k)</span></div><div> <span style="color:rgb(0,0,255)">int</span> l = -<span style="color:rgb(9,136,90)">1</span>;</div><div> <span style="color:rgb(0,0,255)">int</span> r = a.length;</div><br><div> Integer ans = null;</div><div> <span style="color:rgb(0,0,255)">while</span>(r > l + <span style="color:rgb(9,136,90)">1</span>) { <span style="color:rgb(170,170,170)">// r - l will overflow if r = Int.MAX and l = -1</span></div><div> <span style="color:rgb(0,0,255)">int</span> m = (r+l) >>> <span style="color:rgb(9,136,90)">1</span>; <span style="color:rgb(170,170,170)">// r + l will overflow if r = Int.Max and l > 0</span></div><div> <span style="color:rgb(170,170,170)">//System.out.println("in " + l +" " + r +" " +m);</span></div><div> <span style="color:rgb(0,0,255)">if</span>(a[m] < k) {</div><div> l = m;</div><div> } <span style="color:rgb(0,0,255)">else</span> {</div><div> r = m;</div><div> }</div><div> <span style="color:rgb(170,170,170)">//System.out.println("out " + l +" " + r +" " +m);</span></div><div> }</div><div> <span style="color:rgb(0,0,255)">return</span> l;</div><div> }</div><br><div> <span style="color:rgb(0,0,255)">private</span> <span style="color:rgb(0,0,255)">static</span> Integer lastLte(<span style="color:rgb(0,0,255)">int</span>[] a, <span style="color:rgb(0,0,255)">int</span> k) {</div><div> <span style="color:rgb(170,170,170)">// last LTE means (the last k) or (element just left of virtual k)</span></div><div> <span style="color:rgb(0,0,255)">int</span> l = -<span style="color:rgb(9,136,90)">1</span>;</div><div> <span style="color:rgb(0,0,255)">int</span> r = a.length;</div><br><div> Integer ans = null;</div><div> <span style="color:rgb(0,0,255)">while</span>(r > l + <span style="color:rgb(9,136,90)">1</span>) { <span style="color:rgb(170,170,170)">// r - l will overflow if r = Int.MAX and l = -1</span></div><div> <span style="color:rgb(0,0,255)">int</span> m = (r+l) >>> <span style="color:rgb(9,136,90)">1</span>; <span style="color:rgb(170,170,170)">// r + l will overflow if r = Int.Max and l > 0</span></div><div> <span style="color:rgb(170,170,170)">//System.out.println("in " + l +" " + r +" " +m);</span></div><div> <span style="color:rgb(0,0,255)">if</span>(a[m] <= k) {</div><div> l = m;</div><div> } <span style="color:rgb(0,0,255)">else</span> {</div><div> r = m;</div><div> }</div><div> <span style="color:rgb(170,170,170)">//System.out.println("out " + l +" " + r +" " +m);</span></div><div> }</div><div> <span style="color:rgb(0,0,255)">return</span> l;</div><div> }</div><br><div> <span style="color:rgb(0,0,255)">private</span> <span style="color:rgb(0,0,255)">static</span> Integer firstGt(<span style="color:rgb(0,0,255)">int</span>[] a, <span style="color:rgb(0,0,255)">int</span> k) {</div><div> <span style="color:rgb(170,170,170)">//firstGt means the element just right of k (or virtual k)</span></div><div> <span style="color:rgb(0,0,255)">int</span> l = -<span style="color:rgb(9,136,90)">1</span>;</div><div> <span style="color:rgb(0,0,255)">int</span> r = a.length;</div><br><div> Integer ans = null;</div><div> <span style="color:rgb(0,0,255)">while</span>(r > l + <span style="color:rgb(9,136,90)">1</span>) { <span style="color:rgb(170,170,170)">// r - l will overflow if r = Int.MAX and l = -1</span></div><div> <span style="color:rgb(0,0,255)">int</span> m = (r+l) >>> <span style="color:rgb(9,136,90)">1</span>; <span style="color:rgb(170,170,170)">// r + l will overflow if r = Int.Max and l > 0</span></div><div> <span style="color:rgb(170,170,170)">//System.out.println("in " + l +" " + r +" " +m);</span></div><div> <span style="color:rgb(0,0,255)">if</span>(a[m] > k) {</div><div> r = m;</div><div> } <span style="color:rgb(0,0,255)">else</span> {</div><div> l = m;</div><div> }</div><div> <span style="color:rgb(170,170,170)">//System.out.println("out " + l +" " + r +" " +m);</span></div><div> }</div><div> <span style="color:rgb(0,0,255)">return</span> r;</div><div> }</div><br><div> <span style="color:rgb(0,0,255)">private</span> <span style="color:rgb(0,0,255)">static</span> Integer firstGte(<span style="color:rgb(0,0,255)">int</span>[] a, <span style="color:rgb(0,0,255)">int</span> k) {</div><div> <span style="color:rgb(170,170,170)">//firstGte means (the first k) or (the element just right of virtual k)</span></div><div> <span style="color:rgb(0,0,255)">int</span> l = -<span style="color:rgb(9,136,90)">1</span>;</div><div> <span style="color:rgb(0,0,255)">int</span> r = a.length;</div><br><div> <span style="color:rgb(170,170,170)">//exact match</span></div><div> Integer ans = null;</div><div> <span style="color:rgb(0,0,255)">while</span>(r > l + <span style="color:rgb(9,136,90)">1</span>) { <span style="color:rgb(170,170,170)">// r - l will overflow if r = Int.MAX and l = -1</span></div><div> <span style="color:rgb(0,0,255)">int</span> m = (r+l) >>> <span style="color:rgb(9,136,90)">1</span>; <span style="color:rgb(170,170,170)">// r + l will overflow if r = Int.Max and l > 0</span></div><div> <span style="color:rgb(170,170,170)">//System.out.println("in " + l +" " + r +" " +m);</span></div><div> <span style="color:rgb(0,0,255)">if</span>(a[m] >= k) {</div><div> r = m;</div><div> } <span style="color:rgb(0,0,255)">else</span> {</div><div> l = m;</div><div> }</div><div> <span style="color:rgb(170,170,170)">//System.out.println("out " + l +" " + r +" " +m);</span></div><div> }</div><div> <span style="color:rgb(0,0,255)">return</span> r;</div><div> }</div><br><div> <span style="color:rgb(0,0,255)">private</span> <span style="color:rgb(0,0,255)">static</span> Integer any(<span style="color:rgb(0,0,255)">int</span>[] a, <span style="color:rgb(0,0,255)">int</span> k) {</div><div> <span style="color:rgb(0,0,255)">int</span> l = -<span style="color:rgb(9,136,90)">1</span>;</div><div> <span style="color:rgb(0,0,255)">int</span> r = a.length;</div><br><div> <span style="color:rgb(170,170,170)">//exact match</span></div><div> Integer ans = null;</div><div> <span style="color:rgb(0,0,255)">while</span>(r > l + <span style="color:rgb(9,136,90)">1</span>) { <span style="color:rgb(170,170,170)">// r - l will overflow if r = Int.MAX and l = -1</span></div><div> <span style="color:rgb(0,0,255)">int</span> m = (r+l) >>> <span style="color:rgb(9,136,90)">1</span>; <span style="color:rgb(170,170,170)">// r + l will overflow if r = Int.Max and l > 0</span></div><div> <span style="color:rgb(170,170,170)">//System.out.println("in " + l +" " + r +" " +m);</span></div><div> <span style="color:rgb(0,0,255)">if</span>(a[m] == k) {</div><div> ans = m; <span style="color:rgb(0,0,255)">break</span>;</div><div> }</div><div> <span style="color:rgb(0,0,255)">else</span> <span style="color:rgb(0,0,255)">if</span>(a[m] > k) {</div><div> r = m;</div><div> } <span style="color:rgb(0,0,255)">else</span> {</div><div> l = m;</div><div> }</div><div> <span style="color:rgb(170,170,170)">//System.out.println("out " + l +" " + r +" " +m);</span></div><div> }</div><div> <span style="color:rgb(0,0,255)">return</span> ans;</div><div> }</div><div>}</div></div> </div></div> oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-52804377156658585462019-09-19T17:26:00.001+05:302019-09-19T17:26:28.902+05:30MapReduce paper Notes<div dir="ltr">Paper is <a href="https://pdos.csail.mit.edu/6.824/papers/mapreduce.pdf">here</a>.<div><br></div><div>MR programming model - <b>Map</b> - written by the user - takes an input key/value pair (file_name/file_contents) and emits intermediate pairs like (word, 1) for Wordcount. MR library combines all intermediate pairs and gives them to a reduce task which gives the final output for that intermediate key.</div><div><br></div><div>Handling master failure - just abort the MR task since there is only one master and hence failure is unlikely. Other option is to periodically save state of the master and pick up from there.<br><br>Locality - to preserve bandwidth, the MR master tries to schedule map task on a machine having the copy of the data or near to a replica of the data.</div><div><br></div><div>Backup tasks for Stragglers - When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks. The task is marked as completed whenever either the primary or the backup execution completes.</div><div><br></div><div>Custom partitioning - default is hash(key) mod R, at times it's not enough. For e.g. if it's web links and we want all links with same hostname in same output, then hash(hostname(key)) mod R is right hashing approach.</div><div><br></div><div>Combiner function - after the Map task is executed a big payload is sent over the network to the reducer. To reduce the size we can combine the intermediate keys which come out from the map. Reduce and Combine will most likely have the same code - just that output of Reduce will be written to an output file whereas output of the Combiner will be sent over the network to Reduce.</div><div><br></div><div>Debugging - there is a way to sequentially execute the MR tasks to debug them better.</div><div><br></div><div>Counters - there are ways to keep counts of specific things. They are sent from worker machines to master (piggybacking on the ping response).</div><div><br></div><div><a href="https://stackoverflow.com/a/1152903/49560">Implementing Sort</a> - TeraSort is a standard map/reduce sort, except for a custom partitioner that uses a sorted list of N − 1 sampled keys that define the key range for each reduce. In particular, all keys such that sample[i − 1] <= key < sample[i] are sent to reduce i. This guarantees that the output of reduce i are all less than the output of reduce i+1.</div><div><br></div><div>Some numbers - Cluster of 1800 machines with each one having two 2GHz Intel Xeon processors with HyperThreading enabled, 4GB of memory, two 160GB IDE disks and a gigabit Ethernet link.</div><div>Grep which scanned through 10^10 100 bytes records (1TB in total) took 150 seconds.</div><div>Sorting was around 1000 seconds.</div><div><br></div><div>Advantages - </div><div>Code to deal with fault tolerance/distribution and parallelization is removed from the core task code.</div><div>Conceptually simpler.</div><div>MapReduce exploits a restricted programming model to parallelize the user program automatically and to provide transparent fault-tolerance.<br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div></div> oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-54267912882262368342019-09-05T01:57:00.001+05:302019-09-05T01:57:59.188+05:30CAP theorem<div dir="ltr"><a href="http://ksat.me/a-plain-english-introduction-to-cap-theorem/">http://ksat.me/a-plain-english-introduction-to-cap-theorem/</a> <div><br></div><div>CA => Consistent and available</div><div><br></div><div>C => same state</div><div>A => every request should receive a response</div><div>P => any number of messages in intercom can be lost</div><div><br></div><div>If</div><div>C => same state</div><div></div><div>AND<br></div><div>A => every request should receive a response, so every write request should be Success or Failure</div><div></div><div>THEN<br></div><div>!P => since you can't lose intercom messages</div><div><br></div><div>CP</div><div>If</div><div><div>C => same state</div><div></div><div>AND<br></div><div>P => any number of messages in intercom can be lost</div><div></div><div>THEN<br></div><div>!A => can't respond to write requests since there is no assurance of being C as the messages are lost</div><div><br></div><div>AP</div><div><div><div>A=> take write requests and respond</div><div></div><div>AND<br></div></div><div>P => any number of messages in intercom can be lost</div><div></div><div></div><div>THEN<br></div><div>!C => in absence of communication they will be in different states.</div><div><br></div><div><a href="http://blog.thislongrun.com/2015/04/the-unclear-cp-vs-ca-case-in-cap.html">http://blog.thislongrun.com/2015/04/the-unclear-cp-vs-ca-case-in-cap.html</a> <br></div><div><br></div><div></div></div><div></div></div></div> oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0tag:blogger.com,1999:blog-5434031919110451140.post-7153295019858499222019-08-07T16:34:00.003+05:302019-08-08T00:33:08.836+05:30trinity piano grade 1 enchanted garden notes<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-IGute26B344/XUqwD6bNP-I/AAAAAAACKU0/EHYTfwOCDcAfBP7LY5fAuFbmbJIvjH0UgCKgBGAs/s1600/IMG_20190729_141329.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" src="https://1.bp.blogspot.com/-IGute26B344/XUqwD6bNP-I/AAAAAAACKU0/EHYTfwOCDcAfBP7LY5fAuFbmbJIvjH0UgCKgBGAs/s1600/IMG_20190729_141329.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Enchanted garden notes:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Bar 1-8: </span><a href="https://www.youtube.com/watch?v=NUKosaWmJVc&list=LLKd4SDfv1ujxDFLpKOh4fIw" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre;">https://www.youtube.com/watch?v=NUKosaWmJVc&list=LLKd4SDfv1ujxDFLpKOh4fIw</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Notice treble clef for both the hands.</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Right hand:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">AD AD BBCB AD AD AD BBCB A</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Left hand:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">GF GF GE GF GF GF GE GF</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">D_ D_ D_ D_ D_ D_ D_ D_</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Bar 9-12 make up the phrase: (notice left hand switching to bass clef)</span></div>
<span id="docs-internal-guid-76c9ef0b-7fff-35f8-3c32-2eb7b7c78590"></span><br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<a href="https://www.youtube.com/watch?v=TQa8BOJUuuw&list=LLKd4SDfv1ujxDFLpKOh4fIw" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre;">https://www.youtube.com/watch?v=TQa8BOJUuuw&list=LLKd4SDfv1ujxDFLpKOh4fIw</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Right hand:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">A_CFG_AGA_CFG_AGF_ (finger 3 on A and ends with 1st finger on F)</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Left hand:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">F C R | Bf F R | F C R | Bf F R (play with fingers 1,5 or 2,5 based on hand size)</span></div>
</div>
oneandonlyhttp://www.blogger.com/profile/17637779233334120042noreply@blogger.com0