Wednesday, December 27, 2017

cryptography course - Authenticated Encryption

Authenticated Encryption
How to secure against tampering.

If message needs integrity but no confidentiality - use MAC
If message needs integrity and confidentiality - use Authenticated Encryption

3 options:
SSL (MAC-then-Encrypt), IPsec (Encrypt-then-MAC), SSH (Encrypt-and-MAC) => Encrypt-then-MAC (the IPsec approach) is the best one: it always provides AE
Standards:
GCM, CCM, EAX
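
A minimal sketch of authenticated encryption with GCM, using the JDK's built-in "AES/GCM/NoPadding" cipher. The class name, the helper names and the nonce || ciphertext || tag layout are my own choices for illustration, not something from the course.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class GcmSketch {
    static final int NONCE_BYTES = 12; // 96-bit nonce, the usual GCM choice
    static final int TAG_BITS = 128;

    static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    // Returns nonce || ciphertext || tag. The nonce must never repeat for a key.
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] nonce = new byte[NONCE_BYTES];
        new SecureRandom().nextBytes(nonce);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce));
        byte[] ct = cipher.doFinal(plaintext); // ciphertext with the tag appended
        byte[] out = new byte[nonce.length + ct.length];
        System.arraycopy(nonce, 0, out, 0, nonce.length);
        System.arraycopy(ct, 0, out, nonce.length, ct.length);
        return out;
    }

    // Throws AEADBadTagException if the nonce, ciphertext or tag was modified.
    static byte[] decrypt(SecretKey key, byte[] message) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, message, 0, NONCE_BYTES));
        return cipher.doFinal(message, NONCE_BYTES, message.length - NONCE_BYTES);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newKey();
        byte[] msg = encrypt(key, "attack at dawn".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, msg), StandardCharsets.UTF_8));

        msg[msg.length - 1] ^= 1; // flip one bit of the tag
        try {
            decrypt(key, msg);
            System.out.println("tampering NOT detected");
        } catch (AEADBadTagException e) {
            System.out.println("tampering detected");
        }
    }
}
```

Decryption either returns the plaintext or fails as a whole, which is the point of AE: a tampered message is rejected before any plaintext is released.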

OCB : a direct construction from a PRP - Efficient in the sense that you don't have to invoke AES(or another block cipher) twice - once each for encryption and MAC
- parallel
But OCB is not widely used and not a standard - primarily due to various patents

TLS
AE in real world

Attacks
IMAP over TLS
Padding Oracle
Attacking non-atomic decryption => 

KDF
HKDF - key derivation function from HMAC (Generating multiple keys from one key)
Password-based KDF - PBKDF2 (PKCS #5)
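
A rough sketch of the HKDF idea (extract, then expand) on top of the JDK's HmacSHA256. The class and method names are made up for illustration, and the sketch assumes a non-empty salt and an output of at most 255 * 32 bytes. It shows how two independent-looking keys come out of one source key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class HkdfSketch {
    // HMAC-SHA256 over the concatenation of the given parts.
    static byte[] hmac(byte[] key, byte[]... parts) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        for (byte[] p : parts) {
            mac.update(p);
        }
        return mac.doFinal();
    }

    // Extract: concentrate the source key material into one pseudorandom key.
    static byte[] extract(byte[] salt, byte[] sourceKey) throws Exception {
        return hmac(salt, sourceKey);
    }

    // Expand: T(i) = HMAC(prk, T(i-1) || info || i), concatenated and truncated.
    static byte[] expand(byte[] prk, byte[] info, int length) throws Exception {
        byte[] out = new byte[length];
        byte[] t = new byte[0];
        int pos = 0;
        for (int i = 1; pos < length; i++) {
            t = hmac(prk, t, info, new byte[] {(byte) i});
            int n = Math.min(t.length, length - pos);
            System.arraycopy(t, 0, out, pos, n);
            pos += n;
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = new byte[32]; // all-zero salt, only for the demo
        byte[] prk = extract(salt, "shared secret".getBytes(StandardCharsets.UTF_8));
        // Different "info" strings give different keys from the same source key.
        byte[] encKey = expand(prk, "encryption".getBytes(StandardCharsets.UTF_8), 16);
        byte[] macKey = expand(prk, "mac".getBytes(StandardCharsets.UTF_8), 32);
        System.out.println(encKey.length + " " + macKey.length);
        System.out.println(Arrays.equals(encKey, Arrays.copyOf(macKey, 16)));
    }
}
```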

Searching on Encrypted data
Deterministic Encryption - cannot be CPA secure, because equal messages give equal cipher texts. Solution - never encrypt the same message twice under the same key, i.e. the pair (k, m) is unique (deterministic CPA security). CBC with fixed IV is not det. CPA secure.
SIV with wide PRP.
EME

Disk Encryption
Encryption cannot expand original text. Sector size fixed.
If 2 sectors have same content, their cipher texts will also be the same. Information will leak.
First approach - let's use different keys for different sectors.
But even with this approach, if the user changes the content of a sector and then reverts it, the same cipher text reappears, which still leaks a pattern.
Tweakable block cipher - where tweak comes from sector number.
XTS tweakable block cipher

Use tweakable encryption when you need many independent PRPs from one key.

Format preserving encryption
Credit card encryption - 

Wednesday, December 13, 2017

Cryptography course - Message integrity

Let's talk about how to ensure integrity rather than confidentiality - for e.g. banner ads.
CRC is not enough. We need a shared key with both parties.
MAC - Message Authentication Codes => S,V
S(k,m) -> t
V(k,m,t) -> 0,1
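
A small sketch of the (S, V) pair in Java, with HMAC-SHA256 standing in as the MAC. The class and method names are only illustrative. V recomputes the tag and compares it with MessageDigest.isEqual, which is meant to take the same time wherever the first mismatch is.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class MacSketch {
    // S(k, m) -> t
    static byte[] sign(byte[] key, byte[] message) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(message);
    }

    // V(k, m, t) -> true/false
    static boolean verify(byte[] key, byte[] message, byte[] tag) throws Exception {
        return MessageDigest.isEqual(sign(key, message), tag);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "a shared 32 byte demo key 123456".getBytes(StandardCharsets.UTF_8);
        byte[] ad = "banner ad content".getBytes(StandardCharsets.UTF_8);
        byte[] tag = sign(key, ad);
        System.out.println(verify(key, ad, tag));
        System.out.println(verify(key, "modified ad".getBytes(StandardCharsets.UTF_8), tag));
    }
}
```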

Popular constructions
CBC-MAC (built from a block cipher such as AES)
HMAC (built from a hash function such as SHA-256)

Truncated PRF is also secure if 1/2^w is negligible where w is the length after truncation.

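Encrypted CBC-MAC (ECBC)
Raw CBC followed by one final encryption of the result under a second, independent key. Raw CBC alone, which doesn't do the final encryption with a different key, is insecure.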

NMAC (Nested MAC)
Output is in the key space. As opposed to ECBC where output is in X.

In both NMAC and ECBC last encryption step is required else it's insecure.

AES based ECBC is the most popular MAC algo.
AES based ECBC should not be used for more than 2^48 messages.

Message padding
If we append 0s at the end to pad the message, it's risky. Let's say a cheque of amount 1 is the message. We pad 0s at the end, which makes it 1000. Now, both 1 and 1000 have the same tag!!
So, padding must be invertible. If m0 != m1, pad(m0) != pad(m1) should hold.

ISO standard
So, pad with 100..00. While removing the pad, strip the trailing 0s and then the 1 that precedes them.
If the message is already a multiple of the block size, add a pad still.
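
A byte-level sketch of this padding (0x80 followed by 0x00 bytes). The class name is made up, and working on whole bytes instead of bits is my simplification.

```java
import java.util.Arrays;

class IsoPadding {
    // Always adds between 1 and blockSize bytes, so the padding is invertible.
    static byte[] pad(byte[] msg, int blockSize) {
        int padLen = blockSize - (msg.length % blockSize);
        byte[] out = Arrays.copyOf(msg, msg.length + padLen); // new bytes are 0x00
        out[msg.length] = (byte) 0x80;                        // the leading "1" bit
        return out;
    }

    // Strips trailing 0x00 bytes and then the 0x80 marker.
    static byte[] unpad(byte[] padded) {
        int i = padded.length - 1;
        while (i >= 0 && padded[i] == 0) {
            i--;
        }
        if (i < 0 || padded[i] != (byte) 0x80) {
            throw new IllegalArgumentException("bad padding");
        }
        return Arrays.copyOf(padded, i);
    }

    public static void main(String[] args) {
        byte[] one = {1};
        byte[] oneZero = {1, 0};
        // Unlike plain zero padding, these two messages pad to different blocks.
        System.out.println(Arrays.equals(pad(one, 16), pad(oneZero, 16)));
        // A message that is already a full block still gets a whole extra block.
        System.out.println(pad(new byte[16], 16).length);
    }
}
```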

Using CMAC we can avoid padding for messages which are a multiple of the block size.
CMAC derives two extra keys k1, k2 from the main key. If the message is a multiple of the block size, XOR the last block with k2 before the final encryption. If not, pad and XOR with k1.

PMAC
Parallel, incremental.

One time MAC - the integrity analogue of the one time pad
Carter-Wegman MAC - build a many time MAC from a one time MAC

Collision resistance - Merkle-Damgård paradigm
Davies-Meyer compression function.


Timing attacks on MAC verification



Thursday, December 7, 2017

Cryptography course - AES

AES is a subs-perm network not Feistel (in which half the bits remain unchanged in every round). Here all bits change in every round.
Intel Westmere and AMD Bulldozer architectures have special instructions for AES.
AES implementation will have shortest code when tables are not pre computed and vice versa.
So, to transmit AES implementation to browser, just the code is sent and browser pre computes the table.
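On AES-128 the best known key recovery attack is only 4 times faster than exhaustive search - i.e. 2^126. AES-256 though has a related-key attack: given 2^99 inp/oup pairs from 4 related keys, the keys can be recovered in time about 2^99.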

Can we build a PRF from a PRG? Though the ultimate goal is to build PRP.
Answer is yes. It's called GGM PRF.
If we have a PRF, we can plug it in Luby-Rackoff theorem which says that PRF + 3 round Feistel will give us PRP.
But constructing a PRP like this is very slow in practice, so it's not used.

Any secure PRP is a secure PRF if |X| is sufficiently large. For e.g. |X| for AES is 2^128. 

Now,
How to correctly encrypt long messages with block ciphers?
If the message is longer than one block and every block is encrypted independently (ECB mode), two identical parts of the message give identical cipher text blocks, so the attacker can gain information about the underlying text.
One way to solve it (when a key is used for one message only) is to use - Deterministic Counter mode.

Sol 1
Randomized Encryption
Given the same PT, output different CT every time due to randomization. But the size of CT increases since the randomness is encoded in the message.

Sol 2
Nonce based Encryption
Message is encrypted using (k, n) and this pair never repeats. n could be public too. For e.g. for HTTP(s) packet counter can be used as n since packets arrive in order.

Cipher block chaining - with random IV
but if IV is predictable, CBC is not CPA secure. SSL/TLS 1.0 had this bug.

CBC - with nonce
If the nonce is only unique and not random, another key is needed to encrypt the nonce and get the IV, since the IV has to be unpredictable.

Randomized counter mode (CTR)
Parallelizable - Unlike CBC

Advantages of CTR over CBC
Parallel
Requires PRF rather than PRP
Better error bounds
No Padding required
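
A short sketch of randomized counter mode with the JDK's "AES/CTR/NoPadding". The class name and the IV || ciphertext layout are my own choices. It shows two of the points above: no padding (the cipher text is exactly as long as the plain text, plus the IV) and randomization (the same plain text encrypts differently each time).

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class CtrSketch {
    static final int IV_BYTES = 16; // one AES block, used as the initial counter

    // Returns IV || ciphertext.
    static byte[] encrypt(byte[] key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] ct = c.doFinal(plaintext);
        byte[] out = Arrays.copyOf(iv, IV_BYTES + ct.length);
        System.arraycopy(ct, 0, out, IV_BYTES, ct.length);
        return out;
    }

    static byte[] decrypt(byte[] key, byte[] message) throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(message, 0, IV_BYTES));
        return c.doFinal(message, IV_BYTES, message.length - IV_BYTES);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // all-zero demo key, never do this for real data
        byte[] pt = "attack at dawn".getBytes(StandardCharsets.UTF_8);
        byte[] c1 = encrypt(key, pt);
        byte[] c2 = encrypt(key, pt);
        System.out.println(c1.length - IV_BYTES == pt.length); // no padding added
        System.out.println(Arrays.equals(c1, c2));             // randomized
        System.out.println(new String(decrypt(key, c1), StandardCharsets.UTF_8));
    }
}
```

Like the other modes in this post, this gives confidentiality only: flipping a cipher text bit flips the same plain text bit and nothing detects it.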

But all these encryptions don't really protect against message tampering. 


Wednesday, November 29, 2017

Cryptography course notes - Block Ciphers



https://www.coursera.org/learn/crypto

PRP/PRF - Pseudo Random Permutations/Functions. PRP and Block Cipher are terms which can be used interchangeably.
They map a block of size n to output block of size n. For 3DES, n = 64 bits and k(key size) = 168 bits. For AES, n = 128 bits, k = 128, 192, 256
Key is expanded in round keys which are in turn applied to input. It's parallel in nature.

Key expansion. Round function.

Basic building block - Feistel network - take functions which map n bits to n bits and construct invertible functions which map 2n bits to 2n bits.
DES is 16 round Feistel network. Function is same in all rounds, just that key is different.


Luby-Rackoff theorem - a 3 round Feistel network built from a secure PRF gives a secure PRP. Argue that 2 rounds are not secure.

Breaking DES with exhaustive search
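Given 1 (m,c) pair, with probability >= 99.5% only one key maps m to c.
With 2 pairs, prob. is even higher, so 2 pairs are enough for exhaustive search.
56-bit DES has been broken by exhaustive search in 3 months (1997, distributed search), then 3 days (1998, EFF's Deep Crack machine), then 22 hours (1999, both combined). In 2006 COPACOBANA did it in 7 days with 120 off-the-shelf FPGAs.
Double-DES can be broken in 2^63 time (vs 2^56 for single-DES) due to the meet-in-the-middle attack. For triple-DES it's 2^118. 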

DES-X can be used to make DES more secure by doing k1 XOR (E (k2, m XOR k3))

Attacks on the implementation

Side channel attacks
For e.g. if a smart card has a key which is used for encryption/decryption - by observing power consumption you can guess 1) number of cycles (peaks/troughs count) 2) by zooming in on the graph even 0s and 1s can be guessed of the key. 3) Smart card companies try to mask the power but by differential power analysis it can still be broken.

Also, if one core is used for enc/dec, on another core the attack can run which can monitor cache misses since both the cores share the cache.

Fault attacks
Computing errors in the last round expose the secret key k

Linear and Differential attacks
Given many inp/oup pairs, can we recover key in time less than 2^56 (Exhaustive search).
Linear cryptanalysis -
XOR a subset of message bits with a subset of CT bits. For an ideal cipher the result equals the XOR of a subset of key bits with prob = 1/2.
But if it's 1/2 + epsilon then there is some bias.
In case of DES epsilon is non-trivial, which breaks DES given about 2^42 inp/oup pairs. Turns out the 5th S-box (S5) has some linearity which results in the 2^42 attack.

Quantum attacks:
Generic search problem:
Given f:X->{0,1}
Goal : find x in X s.t. f(x) = 1

Classical computer will take O(|X|) time. Quantum computing can help us to solve this problem faster, in O(|X|^(1/2)) (Grover's algorithm). A quantum computer could break DES in 2^28 as opposed to 2^56, and AES-128 in 2^64. These days 2^64 is considered insecure, so AES-256 would be the choice against quantum attacks.

Sunday, November 19, 2017

Cryptography course notes

https://www.coursera.org/learn/crypto

Stream Ciphers 4 - What is a secure cipher?



Statistical tests - given an input it will tell how random it is.
Advantage = |Pr[A(G(k)) = 1] - Pr[A(r) = 1]|, where G is the PRG, r is a truly random string and A is a statistical test which returns 1 if it thinks its input is random enough. Advantage is close to 1 if A can distinguish very well between a truly random string and PRG output, else it's close to 0.

A PRG is secure if ADV_PRG[A,G] is negligible for every efficient statistical test A. It means it's difficult to distinguish between PRG and truly random.
Are there provably secure PRGs? We don't know. A proof would imply P != NP.

Secure PRGs are unpredictable. If, given the first i bits, an algo can predict bit i+1 with prob > 1/2 + epsilon where epsilon is non-negligible, then the PRG is predictable and Advantage > epsilon.
Theorem => if for all i in (0 to n-1) PRG G is unpredictable at position i then G is secure PRG.
If next bit predictors can't distinguish G from random then no statistical test can.

Semantic Security - if attacker can't distinguish between Exp(0) and Exp(1) - i.e. m0 and m1. Definition similar to advantage.

Quiz
<?php 
    $cipherText = '6c73d5240a948c86981bc294814d';
    $originalText = 'attack at dawn';
    $newText = 'attack at dusk';
    $otpInAscii = pack('H*',$cipherText) ^ $originalText;
    $newCipherText = bin2hex($otpInAscii ^ $newText);
    echo $newCipherText;
?>

Stream cipher with a secure PRG is semantically secure.

Wednesday, November 15, 2017

Coursera cryptography notes

https://www.coursera.org/learn/crypto

W1S5
1. Problems with RC4 (used in HTTPS/WEP): the output is biased, e.g. some
bytes have a higher prob of being 0.
2. LFSR-based stream ciphers implemented in hardware - CSS (Content
Scrambling System, for DVDs), E0 (Bluetooth), A5/1,2 (GSM) - are all
badly broken. The US allowed export of crypto algorithms only with keys
of at most 40 bits. Hence DVD manufacturers were constrained.

3. Modern stream ciphers - eStream - Salsa20(elegant) Sosemanuk

Tuesday, November 14, 2017

Coursera cryptography notes

https://www.coursera.org/learn/crypto

W1S4
1. If you use same pad to encrypt multiple messages m1,m2 - an
attacker can XOR the resulting CTs: C1 XOR C2 = m1 XOR m2, from which one
can easily recover the original messages since there is plenty of
redundancy in English, esp. in ASCII encoding.

2. Real world failures of the 2 time pad - Project Venona (US vs
Russia - 1941-46). MS-PPTP (Windows NT) wherein both server and client
used the same key to encrypt messages. Also 802.11b WEP - IV || k is
used to encrypt a frame. Length of IV is 24 bits. So after 16M frames,
encrypting key gets recycled. So 2 diff msgs encrypted with same key.
Also if you reset router, IV gets reset to 0 - so it will get recycled
faster than normal. Also IV goes like this - 0,1,2,3 so all the keys
are closely related. The PRG used by WEP is RC4 which was demonstrated
to fail after 10^6 frames.

3. Disk encryption fail -
4. OTP is malleable. If attacker has access to CT, he can XOR it with
some pattern to modify the resulting message.

Coursera cryptography notes


W1S1-2
1. Anything which can be done with a Trusted authority can be done without it through some secret protocol communication among all the parties.
2. If Y is any random variable and X is an independent random variable with uniform distribution, then X XOR Y is uniform. In particular, the XOR of 2 independent uniform random variables is also uniform.
3. Birthday paradox - about 1.2 * sqrt(size(U)) independent uniform samples are enough to get 2 samples with the same value, where size(U) is the size of the entire set. 1.2*sqrt(365) ≈ 23 people in a room are enough for 2 of them to share a birthday. About 2^64 samples of 128 bit numbers would yield 2 same numbers. Probability of this happening is >= 0.5
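
A quick numeric check of the birthday bound, as a sketch with made-up class and method names. It computes the exact probability of a repeat among n uniform samples from a set of size u and shows that for u = 365 it crosses 1/2 at n = 23.

```java
class BirthdaySketch {
    // Probability that n independent uniform samples from a set of size u
    // contain at least one repeated value.
    static double collisionProbability(int n, double u) {
        double allDistinct = 1.0;
        for (int i = 0; i < n; i++) {
            allDistinct *= (u - i) / u; // the i-th sample avoids the i earlier ones
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        System.out.println(1.2 * Math.sqrt(365));            // the rule-of-thumb estimate
        System.out.println(collisionProbability(22, 365));   // just below 0.5
        System.out.println(collisionProbability(23, 365));   // just above 0.5
    }
}
```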

W1S3
1. Definition of perfect secrecy: a cipher (E,D) over (K,M,C) has perfect secrecy if for all m0, m1 with |m0| = |m1| and all c, Pr[E(k,m0) = c] = Pr[E(k,m1) = c], where k is uniform in K. In other words, CT only attacks are not possible. The One Time Pad (OTP) has perfect secrecy. OTP is simply m XOR k = c.
2. Perfect secrecy also requires that len(k) >= len(m). OTP satisfies this with equality. So OTP is not practical: if you can transmit the key securely, you can as well transmit the message securely (they are the same length).

How to make OTP practical with stream ciphers?
1. PRG but PRG must be unpredictable. Predictable means that given first few bits of PRG output I can deduce the rest of the bits. If that's so, if the attacker knows first few bits of m and sees the CT, by XORing can get first few bits of PRG output. From those first few bits, can generate rest of the bits.
2. Weak PRGs - A. glibc random() B. LCG 
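3. Negligible epsilon - smaller than 1/poly for every polynomial (e.g. 1/2^n, in practice <= 1/2^80). Non-negligible epsilon - at least 1/poly for some polynomial (in practice >= 1/2^30).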


 

Tuesday, September 5, 2017

Windows checking which port is used by which process

Task Manager -> Performance -> Open Resource Monitor -> Network -> Listening Ports

If you are looking to close the process using port 80 - Net stop HTTP

Wednesday, August 30, 2017

glog

alias glog="git log --pretty=format:\"%h%x09%an%x09%aD%x09%cD%x09%s\""

Friday, July 14, 2017

Hyderabad tennis academies and coaches (and some emerging players)

Power tennis academy - C.V. Nagraj
Sinnet Tennis Academy(Ravi Chander Rao) - Famous Players - Sai dedeepya
Sunjay Tennis Academy(Sunjay sir) - Famous Players - Pranjala yadlapalli
Ace Tennis Academy (Praveen sir) - Famous players - B. Shrivalli Rashmikaa

Emerging players
Sai dedeepya
Pranjala yadlapalli
Mummadi Vineetha
B. Shrivalli Rashmikaa

Thursday, June 8, 2017

Security training - All 18 attack vectors

Command Injection
  1. If you are running System commands - don't take user input blindly which would be used in commands
  2. Apply filtering, rules.
  3. For e.g. user can provide ';id' as part of the input which will expose current user details.
  4. Simple fix - put a regex filter to ensure what all is allowed.
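
A sketch of the regex filter idea in Java. The class name, the host name pattern and the ping command are only assumptions for illustration. The input is checked against an allow-list and then passed as a separate argument, so it never goes through a shell.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class CommandInputFilter {
    // Allow-list: letters, digits, dots and hyphens, not starting with '.' or '-'.
    private static final Pattern SAFE_HOST =
            Pattern.compile("[A-Za-z0-9][A-Za-z0-9.-]{0,252}");

    static boolean isAllowed(String input) {
        return input != null && SAFE_HOST.matcher(input).matches();
    }

    // Builds an argument list that could be handed to ProcessBuilder.
    // Nothing is executed here.
    static List<String> buildPingCommand(String host) {
        if (!isAllowed(host)) {
            throw new IllegalArgumentException("rejected input");
        }
        return Arrays.asList("ping", "-c", "1", host);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("example.com"));    // true
        System.out.println(isAllowed("example.com;id")); // false
        System.out.println(buildPingCommand("example.com"));
    }
}
```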
SQL Injection
  1. Don't trust user input blindly.
  2. Escape or rather use parametrized queries.
Session Fixation
If you accept a user supplied session identifier, another user could use that to take the identity of someone else.

Use of insufficiently random values - for e.g. to generate session ids
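
A sketch of generating a session id from a cryptographically strong source instead of java.util.Random. The class name and the 128-bit length are my assumptions.

```java
import java.security.SecureRandom;
import java.util.Base64;

class SessionIds {
    private static final SecureRandom RNG = new SecureRandom();

    // 128 random bits, encoded as a URL-safe string without padding.
    static String newSessionId() {
        byte[] bytes = new byte[16];
        RNG.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        System.out.println(newSessionId());
        System.out.println(newSessionId());
    }
}
```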

Reflected XSS (Cross-site scripting)

If you print user provided input as it is :
For e.g.

https://tradesearch.codebashing.com/projects?search=%3Cscript%3Ealert('You got hacked')%3C%2Fscript%3E

So, escape user input before printing.

Remedy in Java
In JSP - use <c:out>
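
Outside JSP, the same escaping can be sketched by hand as below. The class and method names are made up, and in a real project a maintained encoder library would be the safer choice. The sketch only covers output placed in HTML text or quoted attribute values.

```java
class HtmlEscape {
    // Replaces the characters that are significant in HTML with entities.
    static String escapeHtml(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints &lt;script&gt;alert(&#39;You got hacked&#39;)&lt;/script&gt;
        System.out.println(escapeHtml("<script>alert('You got hacked')</script>"));
    }
}
```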

Persistent XSS
If you allow the user to submit values to database which will be later printed on a web page. For e.g. if I am viewing someone else's profile, and that person has put javascript in his name - that javascript will execute in my session and send out my cookie information somewhere.

Remedy in Java
Again use <c:out> in JSP

Directory traversal
If you allow user to supply filepath as a request parameter.

For e.g. https://traderesearch.codebashing.com/article/show_file?file=../etc/passwd
Remedy
Don't allow access in this manner. Rather provide ids for files. And manage user access to those ids.
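Else, canonicalize the path and check that it still lies under the allowed directory:

public void doGet(HttpServletRequest request, HttpServletResponse response) {

  StringBuilder result = new StringBuilder();
  String filename = request.getParameter("file");

  try {
     File file = new File("/tmp/" + filename);
     // Resolve ".." and symlinks before checking where the path really points.
     String canonicalPath = file.getCanonicalPath();
     if (!canonicalPath.startsWith("/tmp/")) {
        throw new GenericException("Unauthorized access");
     }
     // try-with-resources closes the reader even if reading fails.
     try (BufferedReader reader = new BufferedReader(new FileReader(canonicalPath))) {
        String line;
        while ((line = reader.readLine()) != null) {
           result.append(line).append('\n');
        }
     }
  } catch (Exception e) {
     e.printStackTrace();
  }

  // ... writing the result to the response is not shown
}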

Privileged interface exposure

For e.g. remove your admin interfaces from google search results.
If they ain't supposed to be exposed outside, don't.

Leftover Debug code

For e.g. a comment left in the JS code which says that adding debug=1 to the URL turns on debugging. Remove such comments.

Authentication credentials in URL - logs etc.


Session exposure within URL  - 


User Enumeration - for e.g. forgot password interface, don't show whether successful or not.


Horizontal privilege escalation- for e.g. just by changing user id I can access someone else's edit profile page.


Vertical privilege escalation  - For e.g. if I just replace "user" with "admin" I get to another page with more privileges - https://tradesocial.codebashing.com/trade_social/admin/show


Insecure redirects - for e.g. yourserver.com/redirect?url=someotherurl . So your server is responsible for redirects. Don't redirect to urls which you don't trust.


Click Jacking - A malicious website displays my stock broking website in an iframe with opacity 0 and places a "Research report" button such that clicking that button actually clicks a button beneath it in the iframe and sells my stocks.


Remedy :


To defend against Click Jacking attacks the web server should be configured to send X-Frame-Options in the response headers. 


Configuring the X-Frame-Options to either DENY, SAMEORIGIN, or ALLOW-FROM will prevent malicious websites from embedding your application's content. 

I.e. it will prevent the browser from loading your application content within <frame> or <iframe> tags of other websites. 

CSRF  - Cross site request forgery (GET) - if I am using GET requests to perform actions like delete/update etc., someone can embed those URLs in a webpage as img src. If I open that web page, using my cookies/session that URL will perform updates on server.

CSRF (POST) - A malicious website could include a form with your POST URL. And if someone clicks on submit, it would use your cookies etc. to execute that action. Or even without clicking on submit, form could be submitted programmatically.
Remedy - use Anti CSRF token, synchronizer token, challenge token etc.


XML External Entity Injection

  1. If users are allowed to upload XML files which your server is going to parse, you are at risk.
  2. An XXE attack works by taking advantage of a feature in XML, namely XML eXternal entities (XXE) that allows external XML resources to be loaded within an XML document.
  3. By submitting an XML file that defines an external entity with a file:// URI, an attacker can effectively trick the application's SAX parser into reading the contents of arbitrary file(s) that reside on the server-side filesystem.

Example:
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE foo [<!ELEMENT foo ANY >
<!ENTITY bar SYSTEM "file:///etc/passwd" >]>

<trades>
    <metadata>
        <name>Apple Inc</name>
        <stock>APPL</stock>
        <trader>
            <foo>&bar;</foo>
        </trader>
    </metadata>
</trades>

  1. Java web applications using XML libraries are particularly vulnerable to external entity (XXE) injection attacks because most Java XML SAX parsers have external entities enabled by default.
  2. To use these parsers safely, you have to explicitly disable referencing of external entities in the SAX parser implementation you use. We'll revisit this in the remediation section later.
  3. Problem code:
public class TradeDocumentBuilderFactory {

    public static DocumentBuilderFactory newDocumentBuilderFactory() {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        try {
              documentBuilderFactory.setFeature("http://xml.org/sax/features/external-general-entities", true);
              documentBuilderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", true);
        } catch(ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
        return documentBuilderFactory;
    }

}

Remedy:
  1. Because user supplied XML input comes from an "untrusted source" it is very difficult to properly validate the XML document in a manner to prevent against this type of attack.
  2. Instead the XML processor should be configured to use only a locally defined Document Type Definition (DTD) and disallow any inline DTD that is specified within user supplied XML document(s).
  3. Due to the fact that there are numerous XML parsing engines available for Java, each has its own mechanism for disabling inline DTD to prevent XXE. You may need to search your XML parser's documentation for how to "disable inline DTD" specifically.

For example with Java XMLInputFactory:
xmlInputFactory.setProperty(
XMLInputFactory.SUPPORT_DTD, false
);

Let's see how the above fix can be applied to our vulnerable example to remediate the XXE vulnerability.

Improved code :

public class TradeDocumentBuilderFactory {

    public static DocumentBuilderFactory newDocumentBuilderFactory() {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        try {
            // Reject any DOCTYPE declaration outright, which rules out inline DTDs.
            documentBuilderFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Also switch off external general and parameter entities.
            documentBuilderFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            documentBuilderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        } catch(ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
        return documentBuilderFactory;
    }

}

DOM XSS (Cross-site scripting)

1. Landing page URL : https://tradenews.codebashing.com/guests/landing#guest
2. Pseudocode for Landing JSP web page

 <h6>
<script>
   var name = document.location.hash.split('#')[1]; //https://tradenews.codebashing.com/guests/landing#guest
   document.write("Hello " + name + "! Please login or signup to access news stories");
</script>
</h6>


Attack
Part 1 : Attacker sends email like this :

Hi Alice, Hope you are well ! Please find below a 20% discount code to join TradeNEWS.

https://tradenews.codebashing.com/guests/landing#<script>window.location = 'https://fake-tradenews.codebashing.com';</script> Kind Regards, Bob



Note the username is replaced with javascript code. Once user clicks on the link, he/she is redirected to a fake website. There credit card information can be stolen.


Remedy
To defend against Cross Site Scripting attacks within the application user's Browser Document Object Model (DOM) environment a defense-in-depth approach is required, combining a number of security best practices.

Note You should recall that for Stored XSS and Reflected XSS injection takes place server side, rather than client browser side. Whereas with DOM XSS, the attack is injected into the Browser DOM, this adds additional complexity and makes it very difficult to prevent and highly context specific, because an attacker can inject HTML, HTML Attributes, CSS as well as URLs.

As a general set of principles the application should first HTML encode and then Javascript encode any user supplied data that is returned to the client. For example using OWASP ESAPI:



document.write("<%= ESAPI.encoder().encodeForJavaScript(ESAPI.encoder().encodeForHTML(userSuppliedData)) %>");



Due to the very large attack surface this approach is no silver bullet, and as such developers are strongly encouraged to review areas of code that are potentially susceptible to DOM XSS, including but not limited to:



window.name
document.referrer
document.URL
document.documentURI
location
location.href
location.search
location.hash
eval
setTimeout
setInterval
document.write
document.writeln


Note: OWASP ESAPI (The OWASP Enterprise Security API) is a free, open source, web application security control library that makes it easier for programmers to write lower-risk applications.

Let's apply a suitable Regex pattern to remediate this particular DOM XSS vulnerability.

<h6>
<script>
   var name = document.location.hash.split('#')[1]; //https://tradenews.codebashing.com/guests/landing#guest

   // Only write the name into the page if it is purely alphanumeric.
   if (name.match(/^[a-zA-Z0-9]*$/)) {
      document.write("Hello " + name + "! Please login or signup to access news stories");
   } else {
      window.alert("Security error");
   }
</script>
</h6>

Wednesday, June 7, 2017

Tuesday, May 30, 2017

Java - escape double quotes and slashes in html

class Main {
  public static void main(String[] args) {
    String str = "<div name=\"hey\">content</div>";
    System.out.println(str);

    // Escape double quotes and forward slashes with a backslash.
    str = str.replace("\"", "\\\"");
    str = str.replace("/", "\\/");

    System.out.println(str);
  }
}

Thursday, May 25, 2017

Android - redirect user to play store for installing app

Intent viewIntent =
new Intent("android.intent.action.VIEW",
Uri.parse("https://play.google.com/store/apps/details?id=com.google.android.apps.ondemand.consumer"));
_mainActivity.startActivity(viewIntent);

Android - uninstall app programmatically

private void uninstallPackage(String name) {
    Intent intent = new Intent(Intent.ACTION_DELETE);
    intent.putExtra(Intent.EXTRA_RETURN_RESULT, true);
    intent.setData(Uri.parse("package:" + name));
    // Request code 1 is what checkUninstallStatus() below looks for in onActivityResult.
    startActivityForResult(intent, 1);
}

private boolean doesPackageExist(String name){
PackageManager pm=getPackageManager();
try {
PackageInfo info=pm.getPackageInfo(name,PackageManager.GET_META_DATA);
} catch (PackageManager.NameNotFoundException e) {
return false;
}
return true;
}

private void checkUninstallStatus(int requestCode, int resultCode, Intent data) {
if (requestCode == 1) {
if (resultCode == RESULT_OK) {
Log.d("TAG", "onActivityResult: user accepted the (un)install");
} else if (resultCode == RESULT_CANCELED) {
Log.d("TAG", "onActivityResult: user canceled the (un)install");
} else if (resultCode == RESULT_FIRST_USER) {
Log.d("TAG", "onActivityResult: failed to (un)install");
}
}
}

@Override
protected void onActivityResult(final int requestCode, final int resultCode, final Intent data) {
checkUninstallStatus(requestCode,resultCode,data);
}

Tuesday, May 23, 2017

Probabilistic algorithms

https://www.safaribooksonline.com/oriole/probabilistic-data-structures-in-python

Set Cardinality - HyperLogLog
Set Membership - With False positives but no False negatives - Bloom Filter
Set similarity (document similarity) - MinHash with Jaccard
Frequency Summaries (Leaderboards in games) - Count-min sketch
Streaming Quantiles - stream of data where you have to aggregate metrics without stopping - T-Digest (like min/max/percentile)
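
As a rough illustration of the set-membership entry above, here is a minimal Bloom filter sketch. The class name, the bit-array size, and the trick of salting SHA-256 with an index are all assumptions made for this example, not something taken from the linked course.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: k hash positions per item over an m-bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive k positions by salting one SHA-256 hash with the index i.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely absent"; True only means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for word in ["apple", "banana", "cherry"]:
    bf.add(word)
```

Every added item always reports True (no false negatives); an item that was never added usually reports False, but can occasionally report True when its bit positions happen to collide with bits already set.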



Friday, May 19, 2017

Java print current stack trace

StringWriter sw = new StringWriter();
new Throwable().printStackTrace(new PrintWriter(sw));

//And to actually print it
Log.e("tag","Current stack trace is:\n" + sw.toString());

Friday, May 12, 2017

Tuesday, April 25, 2017

Java - iterate over a UTF-8 string

Either read the file as UTF-8 or convert it later.


public static void main(String[] args) throws Exception {

String path = "D:\\test.txt";
FileInputStream stream = new FileInputStream(new File(path));
BufferedReader br = new BufferedReader(new InputStreamReader(stream,"UTF-8"));
String str;
while ((str = br.readLine()) != null) {
System.out.println(str.length());
for(int i = 0; i< str.length(); ++i) {
System.out.println(String.format("%04x",(int)str.charAt(i)));
}
}
}

OR

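// ISO_8859_1 and UTF_8 are static imports from java.nio.charset.StandardCharsets.
// Decode as ISO-8859-1 first so that getBytes(ISO_8859_1) below recovers the original bytes.
BufferedReader br = new BufferedReader(new InputStreamReader(stream, ISO_8859_1));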
String str;
while ((str = br.readLine()) != null) {
byte[] ptext = str.getBytes(ISO_8859_1);
str = new String(ptext, UTF_8);
System.out.println(str.length());
for(int i = 0; i< str.length(); ++i) {
System.out.println(String.format("%04x",(int)str.charAt(i)));
}
}

Java get unicode point of a character

class Main {
  public static void main(String[] args) {
    char ch = 'ö';
    System.out.println(getUnicodePoint(ch));
    ch = 'म';
    System.out.println(getUnicodePoint(ch));
  }

  // Returns the UTF-16 code unit of ch as four hex digits, e.g. 00f6 for 'ö'.
  private static String getUnicodePoint(char ch) {
    return String.format("%04x", (int) ch);
  }
}

Tuesday, April 11, 2017

outlook not starting


Start Outlook in safe mode to fix "Processing" screen

If Outlook stops responding at a screen that says "Processing," you can close Outlook, start it in safe mode, then close it and open it normally to fix the problem.

Close Outlook.

Launch Outlook in safe mode by choosing one of the following options.

In Windows 10, choose Start, type Outlook.exe /safe, and press Enter.

In Windows 7, choose Start, and in the Search programs and files box, type Outlook /safe, and then press Enter.

In Windows 8, on the Apps menu, choose Run, type Outlook /safe, and then choose OK.

Close Outlook, and then open it normally.

Monday, March 13, 2017

Andrew Ng machine learning

Gradient Descent vs Normal Equation
Normal equation good for smaller feature size

Normal Equation Noninvertibility
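too many features (m <= n, i.e. no more training examples than features)
redundant features (linearly dependent)
pinv vs inv in Octave: pinv (pseudo-inverse) still returns a result when X'X is singular.
Normal equation - regularization (lambda > 0) makes the matrix that gets inverted, X'X + lambda*L, invertible even when X'X itself is not. Here L is the identity matrix with its top-left entry set to 0.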
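
A small numeric sketch of the invertibility point, under simplifying assumptions: two features only, no intercept column, and lambda added to every diagonal entry. The helper names gram_matrix and det2 are made up for this example.

```python
def gram_matrix(X):
    """Return X'X for a list-of-rows matrix with exactly two columns."""
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    return [[a, b], [b, d]]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# The second feature duplicates the first, so the columns are linearly dependent.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
G = gram_matrix(X)

lam = 0.1  # assumed regularization strength
G_reg = [[G[0][0] + lam, G[0][1]],
         [G[1][0], G[1][1] + lam]]

det_plain = det2(G)    # zero: X'X cannot be inverted
det_reg = det2(G_reg)  # positive: the regularized matrix can
```

A zero determinant means inv would fail on X'X, while the regularized matrix has a strictly positive determinant for any lambda > 0.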

Sunday, March 5, 2017

Andrew ng clustering & PCA

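K-means:
randomly initialize the cluster centroids
assign each instance to its nearest centroid
recompute each centroid as the mean of its assigned instances, and repeat until the assignments stop changing
example image compression - choose R,G,B as numerical features and
assign each pixel to a cluster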
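
The loop above can be sketched in a few lines. To keep the run reproducible this sketch takes fixed starting centroids as an argument instead of picking them at random; the function names and the sample points are assumptions for illustration.

```python
def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points; `centroids` is the initial guess."""
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(len(centroids)), key=lambda k: dist2(p, centroids[k]))
                  for p in points]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append((sum(p[0] for p in members) / len(members),
                                      sum(p[1] for p in members) / len(members)))
            else:
                new_centroids.append(old)  # leave an empty cluster where it was
        centroids = new_centroids
    return centroids, labels


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

For image compression the points would be (R, G, B) triples rather than 2-D pairs, and each pixel would be replaced by the colour of its centroid.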

PCA
----------
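example - data compression - reduce n dimensions to K dimensions
covariance matrix captures the variance (spread) of features along directions that are not axis-aligned
eigenvectors of the covariance matrix (the columns of U) give the principal directions
reconstruct an approximation of the original data with the same matrix U
example - image compression - treat each pixel as a feature and keep the K
most important principal components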
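
A minimal sketch of compress-then-reconstruct for the smallest possible case, 2 dimensions down to 1. It uses the closed-form eigenvector angle of a symmetric 2x2 matrix instead of a general eigen solver; the function name and sample points are assumptions.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component and back."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(p[0] - mean_x, p[1] - mean_y) for p in points]

    # Covariance matrix entries [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / n
    b = sum(x * y for x, y in centered) / n
    c = sum(y * y for _, y in centered) / n

    # For a symmetric 2x2 matrix the top eigenvector lies at this angle.
    theta = 0.5 * math.atan2(2 * b, a - c)
    u = (math.cos(theta), math.sin(theta))

    # Compress: z = u . x_centered.  Reconstruct: x_approx = z * u + mean.
    z = [x * u[0] + y * u[1] for x, y in centered]
    approx = [(zi * u[0] + mean_x, zi * u[1] + mean_y) for zi in z]
    return u, z, approx


# Points lying exactly on the line y = x, so one dimension loses nothing.
data = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
u, z, approx = pca_2d_to_1d(data)
```

Because the sample points sit exactly on a line, the one-dimensional representation reconstructs them without error; with noisy data the reconstruction would only be approximate.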

scatter3 in octave for 3D-visualization

andrew ng collaborative filtering

recommender systems - collaborative filtering

If you know weights for movie attributes, romance, action etc, you can
learn weights for user preferences.

If you know weights for user preferences, you can compute movie attributes.

If you don't know both, start with a guess for one and compute other.
then reverse. then reverse. until it converges.

But there is another efficient approach which can solve for both together.

Thursday, March 2, 2017

mysql : database size query

SELECT table_schema "DB Name", Round(Sum(data_length + index_length) /
1024 / 1024, 1) "DB Size in MB" FROM information_schema.tables GROUP
BY table_schema;

http://stackoverflow.com/questions/1733507/how-to-get-size-of-mysql-database

table size query

SELECT TABLE_NAME, table_rows, data_length, index_length, round(((data_length + index_length) / 1024 / 1024),2) "Size in MB" FROM information_schema.TABLES WHERE table_schema = "schema_name"

Wednesday, March 1, 2017

Octave indexing

>> tk
tk =

1 2 3
4 5 6
7 8 9

>> tk(1,:) % first row, all columns
ans =

1 2 3

>> tk(:,1) % first column, all rows
ans =

1
4
7

>> tk(1,1) % first row, first column
ans = 1
>> tk(1,2:3) % first row, second/third columns
ans =

2 3

>> tk(2:3,1) % 2nd,3rd row, 1st column
ans =

4
7

>> tk(2:3) % a single index is a linear index into the column-major data, so this picks elements 2 and 3. The result is in row format as opposed to the previous one
ans =

4 7
% flattening into a column, then transposing into a row
>> tk(:)
ans =

1
4
7
2
5
8
3
6
9

>> tk(:)'
ans =

1 4 7 2 5 8 3 6 9
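
To make the column-major ordering behind tk(:) and tk(2:3) explicit, here is a small Python imitation. The helper names are invented for this sketch; Octave does this natively.

```python
tk = [[1, 2, 3],
      [4, 5, 6],
      [7, 8, 9]]


def column_major(matrix):
    """Flatten a list-of-rows matrix column by column, like Octave's tk(:)."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]


def linear_index(matrix, start, stop):
    """Mimic Octave's 1-based, inclusive tk(start:stop) linear indexing."""
    return column_major(matrix)[start - 1:stop]


flat = column_major(tk)
second_third = linear_index(tk, 2, 3)
```

Walking down the first column before moving to the second is why elements 2 and 3 are 4 and 7 rather than 2 and 3.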

Monday, February 27, 2017

Andrew Ng course

Online learning doesn't require learning rate configuration?

ceiling analysis - machine learning pipeline etc

Anomaly detection - Andrew Ng

Anomaly detection vs Supervised learning - when anomalous examples (y = 1)
are too few, go for anomaly detection

Anomaly detection - choosing features - features should have Normal
distribution. Plot histogram and see. If not, try log(x), log(x+c),
x^0.5, x^0.2 etc. Try combination of features : CPU/Net traffic,
CPU^2/Network traffic etc

Multivariate Normal distribution - let's say memory is unusually high
for a given cpu load. But both of them individually have good enough
probability of occurring. But they are at different sides of their
respective bell curves. So we would go for multivariate Normal
distribution.

Each feature modelled independently as gaussian and multiplied is same
as multivariate Gaussian when axes are aligned, i.e. all off diagonal
components are zero.

Multivariate captures correlations between features automatically.
Otherwise you have to create those unusual features manually.

But the original model is computationally cheaper and scales with
large number of features. In MV, you have to do large matrix
operations.

In MV, m > n is required, i.e. the number of examples should be more than
the number of features; otherwise the covariance matrix cannot be inverted.
The original model has no such requirement.

In MV, the covariance matrix(sigma) should be invertible. It will not
be invertible if there are redundant features, i.e. you have duplicate
features like x2 = x1 or x3 = x4 + x5 etc.
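
A small numeric sketch of the difference between the two models, for two features only. The means, variances, the 0.9 correlation and the test point are made-up values chosen to mimic the "high CPU, low memory" situation described above.

```python
import math


def gaussian(x, mu, var):
    """Univariate normal density."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def mv_gaussian_2d(x, mu, cov):
    """Bivariate normal density with 2x2 covariance matrix `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)' inv(cov) (x - mu), using the 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


mu = (0.0, 0.0)
point = (1.5, -1.5)  # one feature high, the other low

# Original model: independent Gaussians multiplied together.
p_indep = gaussian(point[0], mu[0], 1.0) * gaussian(point[1], mu[1], 1.0)

# Multivariate model with zero off-diagonal entries gives the same value.
p_diag = mv_gaussian_2d(point, mu, [[1.0, 0.0], [0.0, 1.0]])

# With strongly correlated features the same point becomes very unlikely.
p_corr = mv_gaussian_2d(point, mu, [[1.0, 0.9], [0.9, 1.0]])
```

The independent model gives this point an unremarkable probability, while the correlated multivariate model flags it as highly unusual.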

Wednesday, February 15, 2017

neural network notes

Study back propagation and implement gradient descent.
Implement dropout.
Cross entropy is an alternative to quadratic cost function for faster learning.

Softmax is a different activation(output) function. An alternative to Sigmoid. Sum of outputs is always 1. Hence can be thought of as a probability distribution. In a Sigmoid layer, output activations won't always sum to 1.
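
A quick numeric check of the sum-to-1 claim, using arbitrary example inputs:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


z = [2.0, 1.0, 0.1]              # weighted inputs of one output layer
soft = softmax(z)                # sums to 1
sig = [sigmoid(v) for v in z]    # each in (0, 1), but the sum is unconstrained
```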

2 good combinations in NN are : Softmax + Log likelihood cost & Sigmoid + Cross-entropy cost
Usually Softmax + Log Likelihood is good for multi class classification problems.

Validation_data vs. test_data

Validation_data for tuning hyper parameters like learning rate
test_data for evaluation

Avoiding overfitting
Best way to avoid over fitting is to have larger training sets.

Regularization is another way to prevent over fitting since it pushes towards smaller weights. It means small changes in inputs will yield small changes in output. If the weights are large, small changes in input may result in large changes in output. So it's helping the model avoid the effects of noise.

L2 Regularization - add weight^2 to cost
L1 Regularization - add |weight| to cost
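
The two penalty terms side by side, as a sketch: the squared-weight penalty is the L2 term and the absolute-value penalty is the L1 term. The scaling used here (lam/2 for L2, lam for L1, no division by the training-set size) is one common convention and an assumption of this example.

```python
def l2_penalty(weights, lam):
    """L2 term: (lam / 2) * sum of squared weights."""
    return 0.5 * lam * sum(w * w for w in weights)


def l1_penalty(weights, lam):
    """L1 term: lam * sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


weights = [0.5, -2.0, 0.0]
lam = 0.1
l2 = l2_penalty(weights, lam)
l1 = l1_penalty(weights, lam)
```

Either value is added to the unregularized cost, so large weights are penalized and the optimizer is pushed towards smaller ones.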


You could train multiple Neural networks and do a voting on their results.
Similarly, there is Dropout, in which you randomly drop half the hidden neurons for each training batch; the effect is similar to averaging over many different networks.

Expand the data set - for images add rotations/scaling/elastic distortions, for speech - vary the speed up/down, add noise

Weight Initialization
Explore Gaussian.



Vanishing Gradient
In a multi layer NN, the gradients reaching the initial layers can explode or vanish, so those layers may learn much faster or much slower than the later ones.

Convolutional networks

  1. Local receptive fields, stride length
  2. Shared weights and biases - all neurons in a hidden layer will have same weights and biases. So that all of them can detect the same feature at different locations. They are protected against translational changes. An image shifted slightly to right or left of something is still the image of the same thing. 
  3. Map from input layers to hidden layers is called feature map. Shared weights and biases constitute a kernel/filter.
  4. One input layer can be mapped to multiple hidden layers. That enables detection of multiple features.
  5. Later layers could be pooling layers - map 2x2 inputs to one neuron/pixel.
Recurrent Neural Networks(RNNs)
  1. Output of a neuron might be determined by its earlier value. Time based. Might fit Speech and Natural language problems.
Deep Belief Networks (DBNs)
  1. Generative - not only recognize digits, but able to produce them as well.
  2. Able to do unsupervised learning too.
  3. Restricted Boltzmann machines are a key component of DBNs.

What's going on with NNs
  1. Playing video games
  2. NLP



Tuesday, February 14, 2017

Neural network notes - 2

http://neuralnetworksanddeeplearning.com/chap1.html#eqtn3

Feed forward vs Recurrent nets
Feed forward simply give output as input to the next layer
Recurrent nets can give output to previous layer and that could come back as input after some time to the same layer.
But as of now algorithms are not good enough for Recurrent.

Weights and biases/Gradient Descent

Neural networks notes 1

http://neuralnetworksanddeeplearning.com/chap1.html

Perceptrons can compute anything since they can implement NAND gates. NN is a
network of perceptrons which can adjust weights and biases, hence
better than a conventional laid out circuit. Their inputs and outputs
are 0/1. Output = 1 if w.x + b > 0, else 0, where w.x is dot product of
weights and inputs and b is bias. Bias is -threshold.
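
A tiny sketch of a perceptron acting as a NAND gate. The weights (-2, -2) and bias 3 are one valid choice, not the only one.

```python
def perceptron(inputs, weights, bias):
    """Output 1 if w.x + b > 0, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def nand(x1, x2):
    # Only the input (1, 1) drives the weighted sum below zero: -2 - 2 + 3 = -1.
    return perceptron((x1, x2), weights=(-2, -2), bias=3)


truth_table = {(a, b): nand(a, b) for a in (0, 1) for b in (0, 1)}
```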

But inputs to/outputs from Sigmoid neurons can be 0.683, i.e. anything
between 0 to 1. Output activation function = 1/(1 + e^(-z)) where z =
w.x + b. If we plot it, it's a smoothed version of step function or
Perceptron. Which gives it the property that small changes in inputs
result in small changes in output, unlike Perceptron. This property is
helpful in tuning of a NN, otherwise small changes in inputs will
result in significant changes down the line.

Still, Sigmoids and Perceptrons are similar in the sense that for
large positive z the output is close to 1, and for large negative z it is
close to 0.

Essentially, for small changes, Δoutput is approximately a linear function
of the changes Δwj and Δb in the weights and bias.

We can use other activation functions too, but σ(z) ≡ 1/(1+e^-z) is
popular since exponential has nice differential properties.
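
One of those convenient properties is that the derivative of the sigmoid can be written in terms of the sigmoid itself: σ'(z) = σ(z)(1 - σ(z)). The sketch below checks that identity against a finite-difference estimate; the step size h and the test point are arbitrary choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    """Closed-form derivative: sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def numeric_derivative(f, z, h=1e-5):
    """Central finite-difference estimate of f'(z)."""
    return (f(z + h) - f(z - h)) / (2 * h)


z0 = 0.3
closed_form = sigmoid_prime(z0)
estimate = numeric_derivative(sigmoid, z0)
```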

letsencrypt interface not found

rm -rf /root/.local/share/letsencrypt
wget https://raw.githubusercontent.com/letsencrypt/letsencrypt/master/letsencrypt-auto
chmod +x letsencrypt-auto
./letsencrypt-auto --debug renew

exponential function and e

d/dx(e^x) = e^x, i.e. the rate of change at any x is e^x
integral of 1/x is ln(x), i.e. the area under the curve 1/x from x = 1 to
x = k is ln(k); for k = e, the area is 1.
slope of e^x at x = 0 is 1
compound interest: (1 + 1/n)^n approaches e as n approaches infinity
e = sum over n >= 0 of 1/n!
e^x = sum over n >= 0 of x^n/n!
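
A quick numeric check of the compound-interest limit and the series, as a sketch; the number of terms and the value of n are arbitrary.

```python
import math


def compound_limit(n):
    """(1 + 1/n)^n, which approaches e as n grows."""
    return (1 + 1 / n) ** n


def exp_series(x, terms=20):
    """Partial sum of x^n / n! for n = 0 .. terms - 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # turns x^n/n! into x^(n+1)/(n+1)!
    return total


approx_by_limit = compound_limit(10 ** 6)
approx_by_series = exp_series(1.0)
```

The series converges far faster than the limit: twenty terms already match math.e to machine precision, while n = 1,000,000 in the limit is still off in the sixth decimal place.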

Sunday, February 12, 2017

Numpy vs Tensorflow Matrix multiplication

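NumPy is much faster for this tiny 4x4 case, where the per-call TensorFlow session overhead likely dominates (Note: used the CPU version of TensorFlow, not the GPU version)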
import numpy
import tensorflow as tf
import time

def getTestData():
    A = [[1., 2., 3., 4.],[3.,4.,5.,6.],[7.,8,9.,10.],[11.,12.,13.,14.]]
    return 6,A

def tfMatMul():
    n,A = getTestData()
    A = tf.constant(A)
    sess = tf.Session()
    for num in range(1,n):
        A = tf.matmul(A,A)
        output = sess.run(A)
        A = tf.convert_to_tensor(output)
    sess.close()
    return output

def numPyMatMul():
    n,A = getTestData()
    for num in range(1,n):
        A = numpy.matmul(A,A)
    return A


def timedRun(methodToRun):
    start = time.time()
    result = methodToRun()
    end = time.time()
    diff = end - start
    print("Time Taken :"+str(diff))
    print(result)

timedRun(numPyMatMul)
timedRun(tfMatMul)

Saturday, February 11, 2017

conda commands

conda create --name test35 python=3.5
conda info --envs (list)
activate test35
deactivate test35
conda remove --name test35 --all
conda install -y scipy

PyCharm with Anaconda - Using a specific environment

Search for "Project Interpreter"
Click on Settings Icon(Wheel)
Add Local
Choose python.exe in your env, path is like Anaconda3\envs\<env_name>\python.exe

Thursday, February 9, 2017

scala notes

https://www.safaribooksonline.com/library/view/scala-for-the/9780134510613/

for yield => map
for yield with guard clause => filter
reduceLeft

closures - how are they implemented? In Scala, they are objects which capture the method and bindings of free variables.

Expression evaluation (Recursive data structures)
abstract class Expr
case class Num(value: Int) extends Expr
case class Sum(left: Expr, right: Expr) extends Expr
case class Product(left: Expr, right: Expr) extends Expr

val e = Product(Num(3), Sum(Num(4), Num(5)))
def eval(e: Expr):Int = e match {
  case Num(v) => v
  case Sum(l,r) => eval(l) + eval(r)
  case Product(l,r) => eval(l) * eval(r)
}
eval(e)

Expression evaluation (Recursive data structures) - OOP version
abstract class Expr {
  def eval: Int
}

class Num(val data: Int) extends Expr {
  def eval: Int = data
}
class Product(val left: Expr, val right: Expr) extends Expr {
  def eval: Int = left.eval * right.eval
}
class Sum(val left: Expr, val right: Expr) extends Expr {
 def eval: Int = left.eval + right.eval
}

val e = new Product(new Num(3), new Sum(new Num(4), new Num(5)))
e.eval
So what to use: the polymorphism version or case classes?
Use case classes when the set of cases is bounded, like here: there is a finite set of expression types.
Use Polymorphism 


Wednesday, February 8, 2017

scala notes

https://www.safaribooksonline.com/library/view/scala-for-the/9780134510613/

packages => nesting can be all at one place, no need to have similar source directories
imports are flexible, can import specific classes, can hide a class, can alias, can import anywhere (lexical scoping)

Traits (like Java interfaces)
But much more powerful

Traits cannot have constructor parameters; otherwise they are the same as classes.
Traits can be mixed in with individual objects, not only in a class declaration.
Traits can invoke others in a prior layer (consolelogger, timestamplogger, shortlogger)

Thursday, February 2, 2017

Changing the port for react app(create-react-app)

Built with create react app

Edit node_modules/react-scripts/scripts/start.js

Search for DEFAULT_PORT and modify as follows:

var DEFAULT_PORT = process.env.PORT || 80;

Tuesday, January 31, 2017

Redis (Cluster) notes

Benchmark tool
Comes with Redis. Very good with variety of options. 

Server on AWS  (client on my local machine)
PS C:\Users\user> redis-benchmark -t set -n 100000 -h <MY_AWS_FREE_TIER_IP> -p 81
====== SET ======
  100000 requests completed in 40.93 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

0.00% <= 14 milliseconds
0.22% <= 15 milliseconds
10.06% <= 16 milliseconds
24.73% <= 17 milliseconds
30.37% <= 18 milliseconds
32.67% <= 19 milliseconds
43.62% <= 20 milliseconds
63.83% <= 21 milliseconds
71.76% <= 22 milliseconds
74.68% <= 23 milliseconds
79.34% <= 24 milliseconds
89.36% <= 25 milliseconds
95.18% <= 26 milliseconds
97.42% <= 27 milliseconds
98.39% <= 28 milliseconds
98.87% <= 29 milliseconds
99.11% <= 30 milliseconds
99.36% <= 31 milliseconds
99.54% <= 32 milliseconds
99.65% <= 33 milliseconds
99.72% <= 34 milliseconds
99.80% <= 35 milliseconds
99.85% <= 36 milliseconds
99.89% <= 37 milliseconds
99.91% <= 38 milliseconds
99.93% <= 39 milliseconds
99.95% <= 40 milliseconds
99.96% <= 41 milliseconds
99.97% <= 42 milliseconds
99.98% <= 44 milliseconds
99.98% <= 45 milliseconds
100.00% <= 46 milliseconds
100.00% <= 50 milliseconds
100.00% <= 50 milliseconds
2443.38 requests per second

Localhost: (both client and server)
PS C:\Users\user> redis-benchmark -t set -n 100000
====== SET ======
  100000 requests completed in 1.42 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

89.97% <= 1 milliseconds
99.88% <= 2 milliseconds
99.91% <= 3 milliseconds
99.93% <= 4 milliseconds
99.95% <= 8 milliseconds
99.96% <= 9 milliseconds
99.98% <= 10 milliseconds
99.99% <= 11 milliseconds
99.99% <= 12 milliseconds
99.99% <= 13 milliseconds
99.99% <= 14 milliseconds
99.99% <= 15 milliseconds
100.00% <= 16 milliseconds
100.00% <= 17 milliseconds
100.00% <= 18 milliseconds
100.00% <= 19 milliseconds
70521.86 requests per second

With pipelining (QPS is much higher but latency for 90 percentile requests is much higher)
PS C:\Users\user> redis-benchmark -t set -n 100000 -P 100
====== SET ======
  100000 requests completed in 0.21 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

0.00% <= 1 milliseconds
0.60% <= 2 milliseconds
1.00% <= 3 milliseconds
1.10% <= 4 milliseconds
1.30% <= 5 milliseconds
1.40% <= 6 milliseconds
10.70% <= 7 milliseconds
35.80% <= 8 milliseconds
46.70% <= 9 milliseconds
57.00% <= 10 milliseconds
69.70% <= 11 milliseconds
79.10% <= 12 milliseconds
86.60% <= 13 milliseconds
91.70% <= 14 milliseconds
94.00% <= 15 milliseconds
97.00% <= 16 milliseconds
99.30% <= 17 milliseconds
100.00% <= 17 milliseconds
467289.72 requests per second

  1. Redis is single threaded for serving commands; it forks a child process (not a thread) for persistence.
  2. If running in cluster mode, one node should have n/2 instances (master + slave) where n = NUM_CORE since one process for serving commands, another for persistence
  3. RDB vs AOF persistence
  4. Slave can be configured to become master if master hasn't been contacted in a while.
  5. Total of 16384 (~16K) hash slots.
  6. Resharding (redistribution of keys) is always manual, whether adding a node or deleting one. Failover is automatic since slave already has the same keys as master.
  7. Pipelining will increase throughput but 90th-95th percentile latency will be very high. Essentially the 100th percentile (maximum) latency will be lower than in the non-pipelined version, but for other percentiles it will be very high.
  8. Others : Recently Geo commands were added. Though Tile38 is also there for similar stuff.
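
How a key is mapped to one of the hash slots can be sketched as follows. Redis Cluster hashes the key with CRC16 (XMODEM variant) and takes the result modulo 16384; this sketch assumes Python's binascii.crc_hqx with an initial value of 0 computes that same CRC, and the function name key_hash_slot is made up for the example.

```python
import binascii

NUM_SLOTS = 16384


def key_hash_slot(key):
    """Return the cluster hash slot (0..16383) for a string key."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains a non-empty {...} section,
    # only the part between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


slot_a = key_hash_slot("{user1000}.following")
slot_b = key_hash_slot("{user1000}.followers")
```

Keys that share a hash tag land in the same slot, which is what allows multi-key operations on them in cluster mode.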

Forgot alias for android keystore file

keytool -v -list -keystore file.jks

keytool is in jre/bin

Sunday, January 29, 2017

Using Apache as forward proxy

<VirtualHost *:80>
  ServerName proxy.yourdomain.com
  ProxyRequests On
  SSLProxyEngine On

  ProxyPass        /revoke https://myca.com/revoke
  ProxyPassReverse /revoke https://myca.com/revoke

  <Location />
    Order Deny,Allow
    Allow from all
  </Location>
</VirtualHost>

using apache/nginx as reverse proxy server (map to different ports based on domain name)

Assuming nginx is running on port 82 and you want to serve nginx.domain.com with nginx.


<VirtualHost *:80>
    ServerAdmin me@mydomain.com
    ServerName nginx.domain.com
    ProxyPreserveHost On

    # setup the proxy
    <Proxy *>
        Order allow,deny
        Allow from all
    </Proxy>
    ProxyPass / http://localhost:82/
    ProxyPassReverse / http://localhost:82/
</VirtualHost>
