Elasticsearch is great, but its API syntax is a little weird. So I noted down some important APIs I came across in the book Elasticsearch: The Definitive Guide.

Elasticsearch is a horizontally scalable search engine with support for near-real-time search. Here, we will go over how Elasticsearch achieves this.

Python packaging

Being new to Python and PySpark, and having to test PySpark feasibility on an old Hortonworks Data Platform (HDP) cluster, I had many questions. Having worked with Java and Spark, I was expecting a similar workflow for running a PySpark application on the cluster. I assumed there would...

Problem Statement: Given a text file, output every word that appears in the file and how many times each one appears.
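To make the problem concrete, here is a minimal single-machine sketch in plain Python, not the Spark version. The function names `count_words` and `count_words_in_file` are hypothetical, and the tokenization rule (lowercase everything, treat runs of letters, digits, and apostrophes as words) is an assumption; the problem statement does not say how words should be split.

```python
from collections import Counter
import re


def count_words(text):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


def count_words_in_file(path, encoding="utf-8"):
    """Count words in a file, reading line by line to keep memory use low."""
    counts = Counter()
    with open(path, encoding=encoding) as f:
        for line in f:
            counts.update(count_words(line))
    return counts


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog. The end."
    for word, n in count_words(sample).most_common():
        print(word, n)
```

Reading the file line by line works here because a word never spans two lines under this tokenization. A distributed version would follow the same split-then-count shape, with the counting done per partition and then merged.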

Spark is a distributed data processing framework used in the Big Data world to process large amounts of data in parallel across a cluster, which makes the processing faster.