Euclidean Distance: To calculate the distance between two points in a given space, which can be N-dimensional, we can use the Euclidean distance formula. For two points p1 = (x1, y1) and p2 = (x2, y2): D(p1, p2) = SQRT((x1-x2)**2 + (y1-y2)**2), where SQRT is the square root and D is the distance between the points p1 and p2. We can extend this to an N-dimensional space as follows: D(x1,x2,x3,…xn) = … Continue reading “Euclidean vs Manhattan Distance”
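As a rough illustration of the formula in the excerpt above, here is a minimal Python sketch. The function name euclidean_distance and the sample points are my own choices for illustration; the excerpt is cut off before its N-dimensional formula, so the generalization below is the standard textbook one (the square root of the sum of squared coordinate differences), not a copy of the post's own notation.

```python
import math


def euclidean_distance(p, q):
    """Return the Euclidean distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("both points must have the same number of coordinates")
    # Sum the squared differences per coordinate, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Two-dimensional case, matching SQRT((x1-x2)**2 + (y1-y2)**2).
print(euclidean_distance((0, 0), (3, 4)))        # 5.0
# The same function works unchanged for three or more dimensions.
print(euclidean_distance((1, 2, 3), (4, 6, 3)))  # 5.0
```

Passing the points as tuples keeps the function independent of the number of dimensions, which is the point of the N-dimensional extension.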
Author Archives: Uday Boni
Scala vs Python
List some of the advantages of using Scala, as opposed to other languages. To explain this, I am comparing Scala with Python. If you want to do some serious Apache Spark programming, it is better to choose Scala for the following reasons. Note: The following comparisons are based on the fact that I am… Continue reading “Scala vs Python”
Apache Spark
Describe what Accumulators and Broadcast Variables are in Spark, and when to use these shared variables. Accumulators & Broadcast Variables: Both accumulators and broadcast variables are considered shared variables in Spark, meaning that Spark allows both the worker nodes and the driver program to access the values of these variables while processing the data in… Continue reading “Apache Spark”
Sentiment Analysis
A good article on a sentiment analysis (SA) use case: https://ieeexplore.ieee.org/document/8970492
The Turing Test
CAP Theorem
Eric Brewer, a systems professor at the University of California, Berkeley, brought the different trade-offs together in a keynote address to the Principles of Distributed Computing (PODC) conference in 2000, where he presented the CAP theorem. The CAP theorem is a fundamental theorem in distributed systems which states that any distributed system can have at most two of the… Continue reading “CAP Theorem”
Briefly define transparency, fault tolerance, scalability, and naming. Discuss these concepts in the context of HDFS architecture and how some of them you think change in the cluster-oriented architecture.
Transparency: In distributed systems, transparency means that the end user should be able to access remote files as easily as local files. The user should be able to access the files from any system, as long as that system is part of the distributed system. The client/user is not bothered of… Continue reading “Briefly define transparency, fault tolerance, scalability, and naming. Discuss these concepts in the context of HDFS architecture and how some of them you think change in the cluster-oriented architecture.”
NO SQL Databases
MongoDB: This is one of the NoSQL databases, also called a document database. The basic architecture is as follows: MongoDB has multiple databases; each database has multiple collections; each collection has a set of documents, which is where the data is stored. MongoDB stores… Continue reading “NO SQL Databases”
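The database, collection, and document hierarchy described in the excerpt above can be pictured with plain Python dictionaries and lists. This is only a sketch of the data layout, not the MongoDB driver API: the names server, shop, customers, orders, and the helper find are all hypothetical and exist only for illustration.

```python
# Plain-Python model of the hierarchy: databases -> collections -> documents.
# All names and values here are made up for illustration.
server = {
    "shop": {                      # a database
        "customers": [             # a collection: a set of documents
            {"_id": 1, "name": "Asha", "city": "Pune"},
            {"_id": 2, "name": "Ravi", "tags": ["vip"]},  # fields may differ per document
        ],
        "orders": [                # another collection in the same database
            {"_id": 100, "customer_id": 1, "total": 42.5},
        ],
    },
}


def find(server, database, collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    documents = server[database][collection]
    return [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


print(find(server, "shop", "customers", city="Pune"))
```

Documents in the same collection do not need to share the same fields, which is why the second customer can carry a tags list and no city.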
About myself
My full name is Uday Sankar Boni, and I am technology savvy. I love working on the different technologies I come across every day in my professional world. One of the primary reasons I started my own blog is to share my little knowledge of the world of technology. Hope people enjoy what… Continue reading “About myself”