Thursday, June 8, 2017

Security training - All 18 attack vectors

Command Injection
  1. If you run system commands, never pass user input into the command unchecked.
  2. Apply filtering and validation rules to the input.
  3. E.g. a user can supply ';id' as part of the input, which runs the id command and exposes the current user's details.
  4. Simple fix - apply a regex allowlist so that only expected characters are accepted.
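A minimal sketch of the regex-filter idea, written in Python for brevity. The allowlist pattern, the helper names and the `ls -l` command are assumptions chosen for illustration, not part of the training material. Passing the arguments as a list (no shell) means ';' is never interpreted as a command separator, and the filter is a second line of defense.

```python
import re
import subprocess

# Hypothetical allowlist: letters, digits, dot, dash and underscore only.
SAFE_ARG = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_argument(value: str) -> bool:
    """Return True only if the whole value matches the allowlist."""
    return SAFE_ARG.fullmatch(value) is not None


def build_command(filename: str) -> list:
    """Build an argument list; the value is never pasted into a shell string."""
    if not is_safe_argument(filename):
        raise ValueError("rejected input: " + repr(filename))
    # "--" tells ls that what follows is a file name, not an option.
    return ["ls", "-l", "--", filename]


def list_file(filename: str) -> str:
    # No shell=True, so ';' and '|' have no special meaning here.
    completed = subprocess.run(build_command(filename), capture_output=True, text=True)
    return completed.stdout
```

With this sketch, `build_command("report.txt")` returns an argument list, while `build_command("x;id")` raises `ValueError` before anything is executed.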
SQL Injection
  1. Don't trust user input blindly.
  2. Escape the input or, better, use parameterized queries.
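A small sketch of a parameterized query, using Python's built-in sqlite3 module with an in-memory database. The `users` table and the `find_user` helper are hypothetical. The point is that the user-supplied value is bound as data and never concatenated into the SQL text.

```python
import sqlite3


def find_user(conn, name):
    """Look up a user by name with a bound parameter instead of string concatenation."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

normal = find_user(conn, "alice")
# The classic injection string is treated as a literal name and matches nothing.
injected = find_user(conn, "alice' OR '1'='1")
```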
Session Fixation
If you accept a user-supplied session identifier, another user could use it to take over someone else's identity.

Use of insufficiently random values - e.g. to generate session ids
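As an illustration, session ids can be drawn from a cryptographically secure source instead of a general-purpose random number generator. This sketch uses Python's standard `secrets` module; the function name and the 32-byte length are assumptions.

```python
import secrets


def new_session_id() -> str:
    """Return a URL-safe session id built from 32 bytes of OS-level randomness."""
    return secrets.token_urlsafe(32)
```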

Reflected XSS (Cross-site scripting)

This happens if you print user-provided input as is.
E.g.

https://tradesearch.codebashing.com/projects?search=%3Cscript%3Ealert('You got hacked')%3C%2Fscript%3E

So, escape user input before printing.

Remedy in Java
In JSP - use <c:out>
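The same escaping idea outside JSP, as a rough sketch in Python using the standard `html` module. The `render_search_heading` helper is hypothetical. It shows what output escaping does to the payload from the URL above.

```python
import html


def render_search_heading(query: str) -> str:
    """Escape the user-supplied query before placing it in HTML."""
    return "<h1>Results for " + html.escape(query, quote=True) + "</h1>"


page = render_search_heading("<script>alert('You got hacked')</script>")
```

After escaping, the browser shows the payload as text instead of running it.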

Persistent XSS
This happens if you allow the user to submit values to the database which are later printed on a web page. E.g. if I am viewing someone else's profile, and that person has put JavaScript in their name, that JavaScript will execute in my session and can send my cookie information elsewhere.

Remedy in Java
Again use <c:out> in JSP

Directory traversal
This happens if you allow the user to supply a file path as a request parameter.

E.g. https://traderesearch.codebashing.com/article/show_file?file=../etc/passwd
Remedy
Don't allow access in this manner. Instead, expose ids for files and manage user access to those ids.
Otherwise, resolve the canonical path and check that it stays inside the allowed directory:

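public void doGet(HttpServletRequest request, HttpServletResponse response) {
  StringBuilder result = new StringBuilder();
  String filename = request.getParameter("file");

  try {
    File file = new File("/tmp/" + filename);
    // Resolve "../" segments before checking the location.
    String canonicalPath = file.getCanonicalPath();
    if (!canonicalPath.startsWith("/tmp/")) {
      // GenericException is the application's own exception type.
      throw new GenericException("Unauthorized access");
    }
    // try-with-resources closes the reader even if reading fails.
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
      String line;
      while ((line = reader.readLine()) != null) {
        result.append(line).append('\n');
      }
    }
  } catch (Exception e) {
    e.printStackTrace();
  }
  // The remainder of the original method is not shown in these notes.
}
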

Privileged interface exposure

E.g. keep your admin interfaces out of Google search results.
If they are not supposed to be exposed externally, don't expose them.

Leftover Debug code

E.g. a comment left in JS code might say to add debug=1 to the URL. Remove such comments.

Authentication credentials in URL - logs etc.


Session exposure within URL


User Enumeration - e.g. in a forgot-password interface, don't reveal whether the account lookup succeeded or not.


Horizontal privilege escalation - e.g. just by changing the user id I can access someone else's edit profile page.


Vertical privilege escalation - e.g. if I just replace "user" with "admin" in the URL, I get to another page with more privileges - https://tradesocial.codebashing.com/trade_social/admin/show


Insecure redirects - e.g. yourserver.com/redirect?url=someotherurl . Here your server is responsible for the redirect, so don't redirect to URLs which you don't trust.
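One way to decide which redirect targets to trust is an allowlist of hosts. The sketch below is in Python. The host names, the `safe_redirect_target` helper and the fallback to "/" are all assumptions for illustration.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of hosts this server is willing to redirect to.
ALLOWED_HOSTS = {"yourserver.com", "www.yourserver.com"}


def safe_redirect_target(url: str, default: str = "/") -> str:
    """Return url if it is a local path or points at an allowed host, else default."""
    if "\\" in url:
        return default  # some browsers treat a backslash like a slash
    parsed = urlparse(url)
    is_local_path = (
        not parsed.scheme
        and not parsed.netloc
        and url.startswith("/")
        and not url.startswith("//")
    )
    if is_local_path:
        return url
    if parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS:
        return url
    return default
```

Anything that is neither a plain local path nor an http(s) URL on an allowed host falls back to the default page.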


Click Jacking - A malicious website displays my stock broking website in an iframe with opacity 0 and places a "Research report" button on top of it, such that clicking that button actually clicks a button beneath it in the iframe and sells my stocks.


Remedy :


To defend against Click Jacking attacks the web server should be configured to send X-Frame-Options in the response headers. 


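Setting X-Frame-Options to either DENY or SAMEORIGIN will prevent malicious websites from embedding your application's content. (ALLOW-FROM is obsolete and ignored by modern browsers; the Content-Security-Policy frame-ancestors directive is the current way to allow specific origins.)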

I.e. it will prevent the browser from loading your application content within <frame> or <iframe> tags of other websites. 

CSRF - Cross-site request forgery (GET) - if I use GET requests to perform actions like delete/update etc., someone can embed those URLs in a webpage as an img src. If I open that web page, the browser requests that URL with my cookies/session and the update is performed on the server.

CSRF (POST) - A malicious website could include a form targeting your POST URL. If someone clicks submit, the browser sends their cookies etc. and executes that action. Even without a click, the form could be submitted programmatically.
Remedy - use an anti-CSRF token (also called a synchronizer token or challenge token).
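A rough sketch of the synchronizer-token pattern in Python. The session is modelled as a plain dict and both helper names are hypothetical; a real framework would normally provide this. The server stores a random token in the session, embeds it in its own forms, and rejects any state-changing request whose token does not match.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create a random token, remember it server side, and return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session: dict, submitted: str) -> bool:
    """Compare the submitted token with the stored one in constant time."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

A forged form on another site cannot read the token, so its request fails the check.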


XML External Entity Injection

  1. If users are allowed to upload XML files which your server is going to parse, you are at risk.
  2. An XXE attack works by taking advantage of a feature in XML, namely XML eXternal entities (XXE) that allows external XML resources to be loaded within an XML document.
  3. By submitting an XML file that defines an external entity with a file:// URI, an attacker can effectively trick the application's SAX parser into reading the contents of arbitrary file(s) that reside on the server-side filesystem.

Example:
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE foo [<!ELEMENT foo ANY >
<!ENTITY bar SYSTEM "file:///etc/passwd" >]>

<trades>
    <metadata>
        <name>Apple Inc</name>
        <stock>APPL</stock>
        <trader>
            <foo>&bar;</foo>
        </trader>
    </metadata>
</trades>

  1. Java web applications using XML libraries are particularly vulnerable to external entity (XXE) injection attacks because most Java XML SAX parsers have XXE enabled by default.
  2. To use these parsers safely, you have to explicitly disable referencing of external entities in the SAX parser implementation you use. We'll revisit this in the remediation section later.
  3. Problem code:
public class TradeDocumentBuilderFactory {

    public static DocumentBuilderFactory newDocumentBuilderFactory() {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        try {
              documentBuilderFactory.setFeature("http://xml.org/sax/features/external-general-entities", true);
              documentBuilderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", true);
        } catch(ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
        return documentBuilderFactory;
    }

}

Remedy:
  1. Because user-supplied XML input comes from an "untrusted source", it is very difficult to validate the XML document well enough to protect against this type of attack.
  2. Instead the XML processor should be configured to use only a locally defined Document Type Definition (DTD) and disallow any inline DTD that is specified within user-supplied XML document(s).
  3. There are numerous XML parsing engines available for Java, and each has its own mechanism for disabling inline DTD to prevent XXE. You may need to search your XML parser's documentation for how to "disable inline DTD" specifically.

For example with Java XMLInputFactory:
xmlInputFactory.setProperty(
XMLInputFactory.SUPPORT_DTD, false
);

Let's see how the above fix can be applied to our vulnerable example to remediate the XXE vulnerability.

Improved code :

public class TradeDocumentBuilderFactory {

    public static DocumentBuilderFactory newDocumentBuilderFactory() {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        try {
            // Previously both external-entity features were set to true.
            // Disallow inline DOCTYPE declarations and external entities.
            documentBuilderFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            documentBuilderFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            documentBuilderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
        return documentBuilderFactory;
    }

}

DOM XSS (Cross-site scripting)

1. Landing page URL : https://tradenews.codebashing.com/guests/landing#guest
2. Pseudocode for Landing JSP web page

 <h6>
<script>
   var name = document.location.hash.split('#')[1]; //https://tradenews.codebashing.com/guests/landing#guest
   document.write("Hello " + name + "! Please login or signup to access news stories");
</script>
</h6>


Attack
Part 1 : Attacker sends email like this :

Hi Alice, Hope you are well ! Please find below a 20% discount code to join TradeNEWS.

https://tradenews.codebashing.com/guests/landing#<script>window.location = 'https://fake-tradenews.codebashing.com';</script>

Kind Regards, Bob



Note the username is replaced with JavaScript code. Once the user clicks on the link, he/she is redirected to a fake website, where credit card information can be stolen.


Remedy
To defend against Cross-Site Scripting attacks within the application user's browser Document Object Model (DOM) environment, a defense-in-depth approach is required, combining a number of security best practices.

Note: recall that for Stored XSS and Reflected XSS the injection takes place server side rather than client (browser) side. With DOM XSS, the attack is injected into the browser DOM. This adds complexity and makes it very difficult to prevent and highly context specific, because an attacker can inject HTML, HTML attributes, CSS as well as URLs.

As a general set of principles the application should first HTML encode and then JavaScript encode any user-supplied data that is returned to the client. For example using OWASP ESAPI:

document.write("<%=Encoder.encodeForJS(Encoder.encodeForHTML(userSuppliedData))%>");



Due to the very large attack surface this approach is no silver bullet, and as such developers are strongly encouraged to review areas of code that are potentially susceptible to DOM XSS, including but not limited to:



window.name
document.referrer
document.URL
document.documentURI
location
location.href
location.search
location.hash
eval
setTimeout
setInterval
document.write
document.writeln


Note: OWASP ESAPI (The OWASP Enterprise Security API) is a free, open source, web application security control library that makes it easier for programmers to write lower-risk applications.

Let's apply a suitable Regex pattern to remediate this particular DOM XSS vulnerability.

<h6>
<script>
   var name = document.location.hash.split('#')[1]; //https://tradenews.codebashing.com/guests/landing#guest

   // Only letters and digits are accepted as a guest name.
   if (name.match(/^[a-zA-Z0-9]*$/)) {
      document.write("Hello " + name + "! Please login or signup to access news stories");
   } else {
      window.alert("Security error");
   }
</script>
</h6>

Wednesday, June 7, 2017

Tuesday, May 30, 2017

Java - escape double quotes and slashes in html

class Main {
  public static void main(String[] args) {
    String str = "<div name=\"hey\">content</div>";
    System.out.println(str);

    // Escape double quotes and forward slashes with a backslash.
    str = str.replace("\"", "\\\"");
    str = str.replace("/", "\\/");

    System.out.println(str);
  }
}

Thursday, May 25, 2017

Android - redirect user to play store for installing app

Intent viewIntent = new Intent(Intent.ACTION_VIEW,
    Uri.parse("https://play.google.com/store/apps/details?id=com.google.android.apps.ondemand.consumer"));
_mainActivity.startActivity(viewIntent);

Android - uninstall app programmatically

// Request code used to match the uninstall result in onActivityResult.
private static final int UNINSTALL_REQUEST_CODE = 1;

private void uninstallPackage(String name) {
    Intent intent = new Intent(Intent.ACTION_DELETE);
    intent.putExtra(Intent.EXTRA_RETURN_RESULT, true);
    intent.setData(Uri.parse("package:" + name));
    // startActivityForResult is needed for onActivityResult to be called.
    startActivityForResult(intent, UNINSTALL_REQUEST_CODE);
}

private boolean doesPackageExist(String name) {
    PackageManager pm = getPackageManager();
    try {
        pm.getPackageInfo(name, PackageManager.GET_META_DATA);
    } catch (PackageManager.NameNotFoundException e) {
        return false;
    }
    return true;
}

private void checkUninstallStatus(int requestCode, int resultCode, Intent data) {
    if (requestCode == UNINSTALL_REQUEST_CODE) {
        if (resultCode == RESULT_OK) {
            Log.d("TAG", "onActivityResult: user accepted the (un)install");
        } else if (resultCode == RESULT_CANCELED) {
            Log.d("TAG", "onActivityResult: user canceled the (un)install");
        } else if (resultCode == RESULT_FIRST_USER) {
            Log.d("TAG", "onActivityResult: failed to (un)install");
        }
    }
}

@Override
protected void onActivityResult(final int requestCode, final int resultCode, final Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    checkUninstallStatus(requestCode, resultCode, data);
}

Tuesday, May 23, 2017

Probabilistic algorithms

https://www.safaribooksonline.com/oriole/probabilistic-data-structures-in-python

Set Cardinality - HyperLogLog
Set Membership - With False positives but no False negatives - Bloom Filter
Set similarity (document similarity) - MinHash with Jaccard
Frequency Summaries (Leaderboards in games) - Count-min sketch
Streaming Quantiles - stream of data where you have to aggregate metrics without stopping - T-Digest (like min/max/percentile)
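
To make the "false positives but no false negatives" property concrete, here is a toy Bloom filter in Python. It is only a sketch: the sizes, the SHA-256-based hashing scheme and the method names are assumptions, and a production filter would pack bits and size itself from an expected item count.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, item: str):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256((str(i) + ":" + item).encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))
```

An item that was added always answers True. An item that was never added usually answers False, but can answer True if other items happen to have set all of its positions.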



Friday, May 19, 2017

Java print current stack trace

StringWriter sw = new StringWriter();
new Throwable().printStackTrace(new PrintWriter(sw));

//And to actually print it
Log.e("tag","Current stack trace is:\n" + sw.toString());

Friday, May 12, 2017

Tuesday, April 25, 2017

Java - iterate over a UTF-8 string

Either read the file as UTF-8 or convert it later.


public static void main(String[] args) throws Exception {
    String path = "D:\\test.txt";
    FileInputStream stream = new FileInputStream(new File(path));
    // Decode the file as UTF-8 while reading.
    BufferedReader br = new BufferedReader(new InputStreamReader(stream, "UTF-8"));
    String str;
    while ((str = br.readLine()) != null) {
        System.out.println(str.length());
        for (int i = 0; i < str.length(); ++i) {
            // Print each UTF-16 code unit as four hex digits.
            System.out.println(String.format("%04x", (int) str.charAt(i)));
        }
    }
}

OR

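// Assumes static imports of StandardCharsets.ISO_8859_1 and StandardCharsets.UTF_8.
// Read the bytes as ISO-8859-1 (one char per byte, lossless), then re-decode as UTF-8.
BufferedReader br = new BufferedReader(new InputStreamReader(stream, ISO_8859_1));
String str;
while ((str = br.readLine()) != null) {
    byte[] ptext = str.getBytes(ISO_8859_1);
    str = new String(ptext, UTF_8);
    System.out.println(str.length());
    for (int i = 0; i < str.length(); ++i) {
        System.out.println(String.format("%04x", (int) str.charAt(i)));
    }
}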

Java get unicode point of a character

class Main {
  public static void main(String[] args) {
    char ch = 'ö';
    System.out.println(getUnicodePoint(ch));
    ch = 'म';
    System.out.println(getUnicodePoint(ch));
  }

  // Returns the character's code as four hex digits.
  private static String getUnicodePoint(char ch) {
    return String.format("%04x", (int) ch);
  }
}

Tuesday, April 11, 2017

outlook not starting


Start Outlook in safe mode to fix "Processing" screen

If Outlook stops responding at a screen that says "Processing," you can close Outlook, start it in safe mode, then close it and open it normally to fix the problem.

Close Outlook.

Launch Outlook in safe mode by choosing one of the following options.

In Windows 10, choose Start, type Outlook.exe /safe, and press Enter.

In Windows 7, choose Start, and in the Search programs and files box, type Outlook /safe, and then press Enter.

In Windows 8, on the Apps menu, choose Run, type Outlook /safe, and then choose OK.

Close Outlook, and then open it normally.

Monday, March 13, 2017

Andrew Ng machine learning

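Gradient Descent vs Normal Equation
The normal equation is a good choice when the number of features is small.

Normal Equation Noninvertibility
X'X can be non-invertible when there are:
too many features (m <= n)
redundant features (linearly dependent)
Use pinv rather than inv in Octave; pinv still gives a result when X'X is non-invertible.
Normal equation - with regularization (lambda > 0) the matrix being inverted becomes invertible even if X'X is not.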
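
For reference, the two forms of the normal equation these notes refer to can be written as below. This uses the usual course notation as an assumption: X is the m x (n+1) design matrix, y the target vector, lambda the regularization parameter, and L the identity matrix with its top-left entry set to zero so that the intercept term is not regularized.

```latex
\theta = (X^{T}X)^{-1}X^{T}y
\qquad\text{and, with regularization,}\qquad
\theta = (X^{T}X + \lambda L)^{-1}X^{T}y,
\quad L = \operatorname{diag}(0, 1, \dots, 1)
```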

Sunday, March 5, 2017

Andrew ng clustering & PCA

K-means: randomly initialize the cluster centroids
assign each instance to its nearest centroid
recompute each centroid as the mean of the instances assigned to it, and repeat
example image compression - choose R,G,B as numerical features and
assign a cluster to each pixel

PCA
----------
example - data compression - reduce n dimensions to K dimensions
covariance matrix captures the variance (spread) of features that are not axis aligned
eigenvectors of the covariance matrix (matrix U) give the directions to project onto
reconstruct the original data (approximately) using the same matrix U
example - image compression - treat each pixel as a feature and keep the K
most important components

scatter3 in octave for 3D-visualization

andrew ng collaborative filtering

recommender systems - collaborative filtering

If you know weights for movie attributes, romance, action etc, you can
learn weights for user preferences.

If you know weights for user preferences, you can compute movie attributes.

If you don't know both, start with a guess for one and compute other.
then reverse. then reverse. until it converges.

But there is another efficient approach which can solve for both together.

Thursday, March 2, 2017

mysql : database size query

SELECT table_schema "DB Name",
       Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
FROM information_schema.tables
GROUP BY table_schema;

http://stackoverflow.com/questions/1733507/how-to-get-size-of-mysql-database

table size query

SELECT TABLE_NAME, table_rows, data_length, index_length,
       round(((data_length + index_length) / 1024 / 1024), 2) "Size in MB"
FROM information_schema.TABLES
WHERE table_schema = "schema_name";

Wednesday, March 1, 2017

Octave indexing

>> tk
tk =

1 2 3
4 5 6
7 8 9

>> tk(1,:) % first row, all columns
ans =

1 2 3

>> tk(:,1) % first column, all rows
ans =

1
4
7

>> tk(1,1) % first row, first column
ans = 1
>> tk(1,2:3) % first row, second/third columns
ans =

2 3

>> tk(2:3,1) % 2nd,3rd row, 1st column
ans =

4
7

>> tk(2:3) % a single index is linear (column-major) indexing, so this picks elements 2 and 3. The result is a row because the index 2:3 is a row vector
ans =

4 7
% flattening in column/row format
>> tk(:)
ans =

1
4
7
2
5
8
3
6
9

>> tk(:)'
ans =

1 4 7 2 5 8 3 6 9

Monday, February 27, 2017

Andrew Ng course

Online learning doesn't require learning rate configuration?

ceiling analysis - machine learning pipeline etc

Anomaly detection - Andrew Ng

Anomaly detection vs Supervised learning - when anomalous examples are
too few go for anomaly detection

Anomaly detection - choosing features - features should have Normal
distribution. Plot histogram and see. If not, try log(x), log(x+c),
x^0.5, x^0.2 etc. Try combination of features : CPU/Net traffic,
CPU^2/Network traffic etc

Multivariate Normal distribution - let's say memory is unusually high
for a given cpu load. But both of them individually have good enough
probability of occurring. But they are at different sides of their
respective bell curves. So we would go for multivariate Normal
distribution.

Each feature modelled independently as gaussian and multiplied is same
as multivariate Gaussian when axes are aligned, i.e. all off diagonal
components are zero.

Multivariate captures correlations between features automatically.
Otherwise you have to create those unusual features manually.

But the original model is computationally cheaper and scales with
large number of features. In MV, you have to do large matrix
operations.

In MV, m > n is required, i.e. the number of examples should be more than
the number of features; otherwise the covariance matrix cannot be inverted.
The original model has no such requirement.

In MV, the covariance matrix(sigma) should be invertible. It will not
be invertible if there are redundant features, i.e. you have duplicate
features like x2 = x1 or x3 = x4 + x5 etc.
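
The "original model" above (each feature modelled as an independent Gaussian, densities multiplied) can be sketched in a few lines of Python. The training data, the feature meaning and the threshold epsilon are made up for illustration; in practice epsilon would be chosen on a labelled validation set.

```python
import math


def fit_gaussian(values):
    """Return (mean, variance) of one feature, using the 1/m variance estimate."""
    m = len(values)
    mu = sum(values) / m
    var = sum((v - mu) ** 2 for v in values) / m
    return mu, var


def gaussian_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def density(example, params):
    """p(x) as the product of per-feature Gaussian densities."""
    p = 1.0
    for x, (mu, var) in zip(example, params):
        p *= gaussian_pdf(x, mu, var)
    return p


# Hypothetical training data: (cpu_load, memory_use) for normal machines.
train = [(0.50, 0.52), (0.55, 0.50), (0.45, 0.47), (0.52, 0.55), (0.48, 0.46)]
params = [fit_gaussian(column) for column in zip(*train)]

epsilon = 0.05  # hypothetical threshold
typical = density((0.50, 0.50), params)
unusual = density((0.95, 0.10), params)
is_anomaly = unusual < epsilon
```

Because the features are multiplied independently, this version cannot notice an unusual combination of two individually ordinary values. That is the gap the multivariate Gaussian closes.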

Wednesday, February 15, 2017

neural network notes

Study back propagation and implement gradient descent.
Implement dropout.
Cross entropy is an alternative to quadratic cost function for faster learning.

Softmax is a different activation(output) function. An alternative to Sigmoid. Sum of outputs is always 1. Hence can be thought of as a probability distribution. In a Sigmoid layer, output activations won't always sum to 1.
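
A quick numeric check of that claim, as a Python sketch. The input values are arbitrary. The softmax outputs sum to 1, while the sigmoid outputs for the same inputs do not.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Exponentiate and normalize; subtracting the max keeps exp() from overflowing."""
    biggest = max(zs)
    exps = [math.exp(z - biggest) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


z = [2.0, 1.0, 0.1]  # hypothetical weighted inputs to an output layer
softmax_out = softmax(z)
sigmoid_out = [sigmoid(v) for v in z]
```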

2 good combinations in NN are : Softmax + Log likelihood cost & Sigmoid + Quadratic cost
Usually Softmax + Log Likelihood is good for multi class classification problems.

Validation_data vs. test_data

Validation_data for tuning hyper parameters like learning rate
test_data for evaluation

Avoiding overfitting
Best way to avoid over fitting is to have larger training sets.

Regularization is another way to prevent over fitting since it pushes towards smaller weights. It means small changes in inputs will yield small changes in output. If the weights are large, small changes in input may result in large changes in output. So it's helping the model avoid the effects of noise.

L2 Regularization - add weight^2 to cost
L1 Regularization - add |weight| to cost


You could train multiple Neural networks and do a voting on their results.
Similarly, there is Dropout in which you remove half the neurons at a time which helps you adjust the weights in an average way.

Expand the data set - for images add rotations/scaling/elastic distortions, for speech - vary the speed up/down, add noise

Weight Initialization
Explore Gaussian.



Vanishing Gradient
In a multi-layer NN, the gradients in the early layers can vanish or explode, so those layers learn much more slowly, or much less stably, than the later layers.

Convolutional networks

  1. Local receptive fields, stride length
  2. Shared weights and biases - all neurons in a hidden-layer feature map share the same weights and bias, so that all of them detect the same feature at different locations. This makes the network robust to translation: an image shifted slightly to the right or left is still an image of the same thing.
  3. Map from input layers to hidden layers is called feature map. Shared weights and biases constitute a kernel/filter.
  4. One input layer can be mapped to multiple hidden layers. That enables detection of multiple features.
  5. Later layers could be pooling layers - map 2x2 inputs to one neuron/pixel.
Recurrent Neural Networks(RNNs)
  1. Output of a neuron might be determined by its earlier value. Time based. Might fit Speech and Natural language problems.
Deep Belief Networks (DBNs)
  1. Generative - not only recognize digits, but able to produce them as well.
  2. Able to do unsupervised learning too.
  3. Restricted Boltzmann machines are a key component of DBNs.

What's going on with NNs
  1. Playing video games
  2. NLP



Tuesday, February 14, 2017

Neural network notes - 2

http://neuralnetworksanddeeplearning.com/chap1.html#eqtn3

Feed forward vs Recurrent nets
Feed forward nets simply pass the output of one layer as input to the next layer
Recurrent nets can give output to previous layer and that could come back as input after some time to the same layer.
But as of now algorithms are not good enough for Recurrent.

Weights and biases/Gradient Descent

Neural networks notes 1

http://neuralnetworksanddeeplearning.com/chap1.html

Perceptrons can compute anything since they can implement NAND gates. NN is a
network of perceptrons which can adjust weights and biases, hence
better than a conventional laid out circuit. Their inputs and outputs
are 0/1. Output = 1 if w.x + b > 0, else 0, where w.x is dot product of
weights and inputs and b is bias. Bias is -threshold.

But inputs to/outputs from Sigmoid neurons can be 0.683, i.e. anything
between 0 to 1. Output activation function = 1/(1 + e^(-z)) where z =
w.x + b. If we plot it, it's a smoothed version of step function or
Perceptron. Which gives it the property that small changes in inputs
result in small changes in output, unlike Perceptron. This property is
helpful in tuning of a NN, otherwise small changes in inputs will
result in significant changes down the line.

Still, Sigmoids and Perceptrons are similar in the sense that for
large positive z, output is close to 1 and for large negative z output
is close to 0.

Essentially, Δoutput is approximately a linear function of the changes
Δwj and Δb in the weights and bias.

We can use other activation functions too, but σ(z) ≡ 1/(1+e^-z) is
popular since exponential has nice differential properties.
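
The difference between the two neuron types can be seen in a small Python sketch. The NAND weights (-2, -2) and bias 3 follow the example in the linked chapter. The other numbers are arbitrary and only serve to show a small change in the bias.

```python
import math


def perceptron(weights, inputs, bias):
    """Step output: 1 if w.x + b > 0, else 0."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0


def sigmoid_neuron(weights, inputs, bias):
    """Smooth output between 0 and 1."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# A perceptron with weights -2, -2 and bias 3 behaves as a NAND gate.
nand = [perceptron([-2, -2], [a, b], 3) for a in (0, 1) for b in (0, 1)]

# A tiny change in the bias can flip a perceptron completely...
flip_before = perceptron([-2, -2], [1, 1], 3.99)
flip_after = perceptron([-2, -2], [1, 1], 4.01)

# ...but it moves a sigmoid neuron's output only slightly.
smooth_before = sigmoid_neuron([-2, -2], [1, 1], 3.99)
smooth_after = sigmoid_neuron([-2, -2], [1, 1], 4.01)
```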

letsencrypt interface not found

rm -rf /root/.local/share/letsencrypt
wget https://raw.githubusercontent.com/letsencrypt/letsencrypt/master/letsencrypt-auto
chmod +x letsencrypt-auto
./letsencrypt-auto --debug renew

exponential function and e

d/dx (e^x) = e^x, i.e. the rate of change at any x equals e^x
integral of 1/x = ln(x), i.e. area under the curve 1/x from x = 1 to x =
k is ln(k); for k = e, area is 1.
slope at x=0 of e^x is 1
compound interest, (1 + 1/n)^n -> e as n approaches infinity
e = sum over n >= 0 of 1/n!
e^x = sum over n >= 0 of x^n/n!
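
These facts are easy to check numerically. A short Python sketch follows; the function names and the number of series terms are arbitrary choices.

```python
import math


def compound_limit(n: int) -> float:
    """(1 + 1/n)^n, which approaches e as n grows."""
    return (1.0 + 1.0 / n) ** n


def exp_series(x: float, terms: int = 20) -> float:
    """Partial sum of x^n / n! for n = 0 .. terms-1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # turns x^n/n! into x^(n+1)/(n+1)!
    return total


approx_by_limit = compound_limit(10 ** 6)
approx_by_series = exp_series(1.0)
```

The series converges far faster than the compound-interest limit: 20 terms already agree with math.e to near machine precision, while n = 10^6 in the limit is still off in the sixth decimal place.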

Sunday, February 12, 2017

Numpy vs Tensorflow Matrix multiplication

NumPy is much faster in this test (note: used the CPU version of TensorFlow, not the GPU version)
import numpy
import tensorflow as tf
import time

def getTestData():
    A = [[1., 2., 3., 4.],[3.,4.,5.,6.],[7.,8,9.,10.],[11.,12.,13.,14.]]
    return 6,A

def tfMatMul():
    n,A = getTestData()
    A = tf.constant(A)
    sess = tf.Session()
    for num in range(1,n):
        A = tf.matmul(A,A)
        output = sess.run(A)
        A = tf.convert_to_tensor(output)
    sess.close()
    return output

def numPyMatMul():
    n,A = getTestData()
    for num in range(1,n):
        A = numpy.matmul(A,A)
    return A


def timedRun(methodToRun):
    start = time.time()
    result = methodToRun()
    end = time.time()
    diff = end - start
    print("Time Taken :"+str(diff))
    print(result)

timedRun(numPyMatMul)
timedRun(tfMatMul)

Saturday, February 11, 2017

conda commands

conda create --name test35 python=3.5
conda info --envs (list)
activate test35
deactivate test35
conda remove --name test35 --all
conda install -y scipy

PyCharm with Anaconda - Using a specific environment

Search for "Project Interpreter"
Click on Settings Icon(Wheel)
Add Local
Choose python.exe in your env, path is like Anaconda3\envs\<env_name>\python.exe

Thursday, February 9, 2017

scala notes

https://www.safaribooksonline.com/library/view/scala-for-the/9780134510613/

for yield => map
for yield with guard clause => filter
reduceLeft

closures - how are they implemented? In Scala, they are objects which capture the function and the bindings of its free variables.

Expression evaluation (Recursive data structures)
abstract class Expr
case class Num(value: Int) extends Expr
case class Sum(left: Expr, right: Expr) extends Expr
case class Product(left: Expr, right: Expr) extends Expr

val e = Product(Num(3), Sum(Num(4), Num(5)))
def eval(e: Expr):Int = e match {
  case Num(v) => v
  case Sum(l,r) => eval(l) + eval(r)
  case Product(l,r) => eval(l) * eval(r)
}
eval(e)

Expression evaluation (Recursive data structures) - OOP version
abstract class Expr {
  def eval: Int
}

class Num(val data: Int) extends Expr {
  def eval: Int = data
}
class Product(val left: Expr, val right: Expr) extends Expr {
  def eval: Int = left.eval * right.eval
}
class Sum(val left: Expr, val right: Expr) extends Expr {
 def eval: Int = left.eval + right.eval
}

val e = new Product(new Num(3), new Sum(new Num(4), new Num(5)))
e.eval
So which to use: the polymorphism version or case classes?
Use case classes when your cases are bounded, like here: there is a finite set of expressions.
Use polymorphism when the set of cases is open, i.e. you expect new subclasses to be added later.


Wednesday, February 8, 2017

scala notes

https://www.safaribooksonline.com/library/view/scala-for-the/9780134510613/

packages => nesting can be all at one place, no need to have similar source directories
imports are flexible, can import specific classes, can hide a class, can alias, can import anywhere (lexical scoping)

Traits (like Java interfaces)
But much more powerful

Traits cannot have constructor parameters; otherwise they are the same as classes.
Traits can be mixed in with individual objects when they are created, not just in a class declaration.
Traits can invoke others in a prior layer (ConsoleLogger, TimestampLogger, ShortLogger)

Thursday, February 2, 2017

Changing the port for react app(create-react-app)

Built with create react app

Edit node_modules/react-scripts/scripts/start.js

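Search for DEFAULT_PORT and modify as follows:

var DEFAULT_PORT = process.env.PORT || 80;

Note: changes under node_modules are lost when dependencies are reinstalled. Since this line reads process.env.PORT, setting the PORT environment variable before starting the app should also work without editing the file.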

Tuesday, January 31, 2017

Redis (Cluster) notes

Benchmark tool
Comes with Redis. Very good with variety of options. 

Server on AWS  (client on my local machine)
PS C:\Users\user> redis-benchmark -t set -n 100000 -h <MY_AWS_FREE_TIER_IP> -p 81
====== SET ======
  100000 requests completed in 40.93 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

0.00% <= 14 milliseconds
0.22% <= 15 milliseconds
10.06% <= 16 milliseconds
24.73% <= 17 milliseconds
30.37% <= 18 milliseconds
32.67% <= 19 milliseconds
43.62% <= 20 milliseconds
63.83% <= 21 milliseconds
71.76% <= 22 milliseconds
74.68% <= 23 milliseconds
79.34% <= 24 milliseconds
89.36% <= 25 milliseconds
95.18% <= 26 milliseconds
97.42% <= 27 milliseconds
98.39% <= 28 milliseconds
98.87% <= 29 milliseconds
99.11% <= 30 milliseconds
99.36% <= 31 milliseconds
99.54% <= 32 milliseconds
99.65% <= 33 milliseconds
99.72% <= 34 milliseconds
99.80% <= 35 milliseconds
99.85% <= 36 milliseconds
99.89% <= 37 milliseconds
99.91% <= 38 milliseconds
99.93% <= 39 milliseconds
99.95% <= 40 milliseconds
99.96% <= 41 milliseconds
99.97% <= 42 milliseconds
99.98% <= 44 milliseconds
99.98% <= 45 milliseconds
100.00% <= 46 milliseconds
100.00% <= 50 milliseconds
100.00% <= 50 milliseconds
2443.38 requests per second

Localhost: (both client and server)
PS C:\Users\user> redis-benchmark -t set -n 100000
====== SET ======
  100000 requests completed in 1.42 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

89.97% <= 1 milliseconds
99.88% <= 2 milliseconds
99.91% <= 3 milliseconds
99.93% <= 4 milliseconds
99.95% <= 8 milliseconds
99.96% <= 9 milliseconds
99.98% <= 10 milliseconds
99.99% <= 11 milliseconds
99.99% <= 12 milliseconds
99.99% <= 13 milliseconds
99.99% <= 14 milliseconds
99.99% <= 15 milliseconds
100.00% <= 16 milliseconds
100.00% <= 17 milliseconds
100.00% <= 18 milliseconds
100.00% <= 19 milliseconds
70521.86 requests per second

With pipelining (throughput is much higher, but latency at the 90th percentile is much higher too)
PS C:\Users\user> redis-benchmark -t set -n 100000 -P 100
====== SET ======
  100000 requests completed in 0.21 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

0.00% <= 1 milliseconds
0.60% <= 2 milliseconds
1.00% <= 3 milliseconds
1.10% <= 4 milliseconds
1.30% <= 5 milliseconds
1.40% <= 6 milliseconds
10.70% <= 7 milliseconds
35.80% <= 8 milliseconds
46.70% <= 9 milliseconds
57.00% <= 10 milliseconds
69.70% <= 11 milliseconds
79.10% <= 12 milliseconds
86.60% <= 13 milliseconds
91.70% <= 14 milliseconds
94.00% <= 15 milliseconds
97.00% <= 16 milliseconds
99.30% <= 17 milliseconds
100.00% <= 17 milliseconds
467289.72 requests per second

  1. Redis serves commands from a single thread; it forks a child process for persistence.
  2. If running in cluster mode, a machine should run about n/2 Redis instances (masters + slaves), where n = number of cores, since each instance uses one process for serving commands and another for persistence.
  3. RDB vs AOF persistence
  4. A slave can be configured to become master if the master hasn't been reachable for a while.
  5. Total 16384 (~16K) hash slots.
  6. Resharding (redistribution of keys) is always manual, whether adding a node or deleting one. Failover is automatic since the slave already has the same keys as the master.
  7. Pipelining will increase throughput, but 90th-95th percentile latency will be much higher. In the runs above the maximum latency was about the same (17 ms pipelined vs 19 ms without), while the 90th percentile went from about 2 ms to about 14 ms.
  8. Others: Geo commands were added recently. Tile38 also exists for similar geospatial use cases.
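
To make the hash slot point concrete: Redis Cluster maps a key to a slot as CRC16(key) mod 16384, and if the key contains a non-empty {...} hash tag, only the tag is hashed. A minimal sketch, assuming Python's binascii.crc_hqx with an initial value of 0 matches the CRC16 variant Redis uses; the helper names are my own:

```python
import binascii

NUM_SLOTS = 16384  # total hash slots in a Redis Cluster

def hashed_part(key: bytes) -> bytes:
    """Return the hash tag between the first '{' and the next '}', if non-empty."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end > start + 1:
            return key[start + 1:end]
    return key  # no usable hash tag: hash the whole key

def key_slot(key: str) -> int:
    """Slot = CRC16 of the hashed part, modulo the number of slots."""
    data = hashed_part(key.encode("utf-8"))
    return binascii.crc_hqx(data, 0) % NUM_SLOTS

# Keys that share a hash tag land in the same slot, so multi-key
# operations on them stay on one node.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

This is why resharding is described in terms of moving slots rather than individual keys: a node owns a range of slots, and every key's slot is fixed by its name.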

Forgot alias for android keystore file

keytool -v -list -keystore file.jks

keytool is in jre/bin

Sunday, January 29, 2017

Using Apache as forward proxy

<VirtualHost *:80>
  ServerName proxy.yourdomain.com
  ProxyRequests On
  SSLProxyEngine On

  ProxyPass        /revoke https://myca.com/revoke
  ProxyPassReverse /revoke https://myca.com/revoke

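  # Note: with ProxyRequests On, allowing everyone makes this an open proxy; restrict access in production.
  <Location />
    Order Deny,Allow
    Allow from all
  </Location>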
</VirtualHost>

using apache/nginx as reverse proxy server (map to different ports based on domain name)

Assuming nginx is running on port 82 and you want to serve nginx.domain.com with nginx.


<VirtualHost *:80>
    ServerAdmin me@mydomain.com
    ServerName nginx.domain.com
    ProxyPreserveHost On

    # setup the proxy
    <Proxy *>
        Order allow,deny
        Allow from all
    </Proxy>
    ProxyPass / http://localhost:82/
    ProxyPassReverse / http://localhost:82/
</VirtualHost>

CSRF/XSS summary

If you are using cookies for authentication, someone can embed URLs from your domain in an arbitrary webpage to trigger side effects, because the browser attaches your cookies to those requests automatically.
For e.g. if you have a GET URL http://domain.com/logout and someone puts <img src="http://domain.com/logout"/> in their webpage and your user visits that page, the user is immediately logged out.
Similarly, POST URLs can be embedded in <form> elements.

But if you are not using cookies, for e.g. you might be using JWT and storing the token in localStorage, you are safe from CSRF, since the browser does not send the token on its own. (The token is still readable by any script running on the page, so XSS still matters.)

Best solution: Don't use cookies for authentication

XSS
-----

A bad guy posts a message on a forum. Message contains a js script tag. Whenever anyone visits the forum, that javascript runs and steals that person's cookie.

Solution: Escape any html or user submitted content you publish on your website.
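
A minimal sketch of the escaping step, using html.escape from Python's standard library; render_comment is a hypothetical helper, not part of any framework:

```python
import html

def render_comment(user_text: str) -> str:
    """Wrap user-submitted text in a paragraph, escaping HTML metacharacters."""
    # quote=True also escapes " and ', which matters inside attribute values.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"

print(render_comment("<script>alert('stolen cookie')</script>"))
```

After escaping, the browser shows the script tag as text instead of executing it. Most template engines do this by default; the risk is usually in the places where auto-escaping is switched off.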

eternal bash history

# Eternal bash history.
# ---------------------
# Undocumented feature which sets the size to "unlimited".
# http://stackoverflow.com/questions/9457233/unlimited-bash-history
export HISTFILESIZE=
export HISTSIZE=
export HISTTIMEFORMAT="[%F %T] "
# Change the file location because certain bash sessions truncate .bash_history file upon close.
# http://superuser.com/questions/575479/bash-history-truncated-to-500-lines-on-each-login
export HISTFILE=~/.bash_eternal_history
# Force prompt to write history after every command.
# http://superuser.com/questions/20900/bash-history-loss
PROMPT_COMMAND="history -a; $PROMPT_COMMAND"

Saturday, January 28, 2017

windows docker

Windows 10 only
enable hyper-v
install docker

Commands:
docker run -it --entrypoint=/bin/bash dharm0us/ubuntu
Ctrl-P then Ctrl-Q to detach without killing the container

docker images to list images
docker ps to list containers
docker attach <container-id>

docker tag <image-id> dharm0us/ubuntu:latest
docker login
docker push dharm0us/ubuntu:latest

docker commit -m"test commit" <container-id> dharm0us/ubuntu:latest
docker push dharm0us/ubuntu:latest

Share directory
 docker run -v //c/projects:/projects  -it --entrypoint=/bin/bash dharm0us/ubuntu

Run in privileged mode (container gets extended privileges on the host)
 docker run --privileged  -v //c/projects:/projects  -it --entrypoint=/bin/bash dharm0us/ubuntu

Run with port mapped
docker run --privileged -p 80:80 -v //c/projects:/projects  -it --entrypoint=/bin/bash dharm0us/ubuntu

Inspect container config
docker inspect <container_id>

Friday, January 27, 2017

setting up Go/GoLang on Windows

1. Download and install at C:/go
2. It should set GOROOT at C:/go, if it doesn't setx GOROOT C:/go
3. setx GOPATH C:/projects/go
4. cd C:/projects/go
5. mkdir src; mkdir src/proj
6. cd src/proj
7. touch main.go
8. touch constants.go
9. go get ./...
10. cd ..
11. go build proj/
12. ./proj.exe

main.go
package main

import (
       "fmt"
       "github.com/ChimeraCoder/anaconda"
       "net/url"
)

func main() {
       fmt.Println("hi")
       anaconda.SetConsumerKey(CONSUMER_KEY)
       anaconda.SetConsumerSecret(CONSUMER_SECRET)
       api := anaconda.NewTwitterApi(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

       v := url.Values{}
       v.Set("track", "Kohli")
       twitterStream := api.PublicStreamFilter(v)
       for {
              x := <-twitterStream.C
              switch tweet := x.(type) {
              case anaconda.Tweet:
                     fmt.Println(tweet.Latitude())
                     fmt.Println(tweet.Text)
                     fmt.Println("-----------")
              case anaconda.StatusDeletionNotice:
                     // pass
              default:
                     fmt.Printf("unknown type(%T) : %v \n", x, x)
              }
       }
}
constants.go
package main

const CONSUMER_KEY = ""
const CONSUMER_SECRET = ""
const ACCESS_TOKEN = ""
const ACCESS_TOKEN_SECRET = ""


Saturday, January 21, 2017

intellij idea setting java language level to 8

Project Settings -> Modules -> Sources -> Language level : 8
Project Settings -> Project -> Project SDK -> 1.8
Project Settings -> Project -> Project Language Level -> 8
File > Settings > Build, Execution, Deployment > Java Compiler > Project Bytecode version -> 1.8
File > Settings > Build, Execution, Deployment > Java Compiler > Per module Bytecode version -> 1.8

Friday, January 20, 2017

today's summary

select/poll/epoll are used by memcached/redis/nodejs/golang for I/O multiplexing
epoll is the newest of the three and is Linux-only.
30% of world's computers are still Windows XP.
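
A minimal sketch of the readiness-based I/O that select/poll/epoll provide, using Python's selectors module (DefaultSelector picks epoll on Linux and falls back to other mechanisms elsewhere). The socket pair stands in for a real client connection and read_when_ready is my own name:

```python
import selectors
import socket

def read_when_ready(message: bytes) -> bytes:
    """Send a message over a socket pair and read it once the selector reports readiness."""
    sel = selectors.DefaultSelector()
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    sel.register(reader, selectors.EVENT_READ)
    try:
        writer.sendall(message)
        received = b""
        # select() returns only the sockets that are ready, so one thread
        # can watch many connections without blocking on any single one.
        for key, _events in sel.select(timeout=1.0):
            received += key.fileobj.recv(1024)
        return received
    finally:
        sel.unregister(reader)
        reader.close()
        writer.close()
        sel.close()

print(read_when_ready(b"ping"))
```

A real server would register its listening socket plus every accepted connection and call select() in a loop; that loop is essentially what the Redis and Node.js event loops are built around.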

nodejs vs golang async
golang - goroutines are multiplexed on top of OS threads; blocking-style code is fine (the runtime takes care of it); goroutines are cheap
nodejs - single threaded; use callbacks, otherwise the event loop is blocked

websocket server - Golang is the right compromise: perf is lower than C++ but the code is much simpler. Nodejs is far behind.

In GoLang - you can use utf-8 characters as var names

redis and Tile38 for geospatial data
Tile38 has replication as well and uses the Redis RESP protocol

Thursday, January 19, 2017

nginx setup amazon linux

tar -xvzf nginx-1.11.8.tar.gz
cd nginx-1.11.8
./configure --sbin-path=/usr/local/sbin --with-http_ssl_module
make
make install


vim /usr/local/nginx/conf/nginx.conf

Add the following to server section:
        listen       80;
        server_name  domain.com;

        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        location / {
            root  /var/www/html/folder;
            index  index.php index.html index.htm;
        }
        location ~ \.php$ {
                fastcgi_pass   unix:/var/run/php-fpm/php-fpm.sock;
                fastcgi_index  index.php;
                fastcgi_param  SCRIPT_FILENAME  /var/www/html/folder$fastcgi_script_name;
                include        fastcgi_params;
        }




nginx (to start), or /usr/local/sbin/nginx
nginx -s stop

sudo yum install php56-fpm
nginx -t to test the config (it also prints the config file path)
service php-fpm start/stop/restart
vim  /etc/php-fpm.d/www.conf -> change user and group
vim /etc/php-fpm.conf

summing memory used on linux

 ps aux --sort rss | awk '{sum+=$6;print $6,sum}'
for RSS (column 6 of ps aux, in KB)
 ps aux --sort rss | awk '{sum+=$5;print $5,sum}'
for VSZ (column 5 of ps aux, in KB)
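
The same sum can be done by column name instead of position, which avoids miscounting columns. A minimal sketch that parses ps aux-style text; sum_ps_column is my own helper and the sample rows are made up for illustration:

```python
def sum_ps_column(ps_output: str, column: str) -> int:
    """Sum a numeric column (e.g. RSS or VSZ) from `ps aux`-style text."""
    lines = ps_output.strip().splitlines()
    header = lines[0].split()
    idx = header.index(column)          # look the column up by its header name
    total = 0
    for line in lines[1:]:
        # COMMAND may contain spaces, so split into at most len(header) fields.
        fields = line.split(None, len(header) - 1)
        total += int(fields[idx])
    return total

# Made-up sample in the usual `ps aux` layout.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 19344 1544 ? Ss Jan19 0:01 /sbin/init
nginx 2301 0.0 0.3 58204 3920 ? S Jan19 0:00 nginx: worker process
"""

print(sum_ps_column(SAMPLE, "RSS"), sum_ps_column(SAMPLE, "VSZ"))
```

To run it against a live system, the text could come from subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout.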
