Playing with Spark & Jupyter Notebooks Locally

Don’t want to pay to play? BYON — Bring Your Own Notebook! Or how I run Spark, Hadoop, Hive, and Jupyter locally.

Og Ramos
3 min read · Nov 26, 2022
Photo by Christopher Gower on Unsplash

👋 ☕️ 🌅

Good morning, everyone! How are you all doing this fine Saturday morning? Ready for a quick chat? Hmm? Yes? Good!

First, I hope you are doing fine after two days of craziness. Every Thanksgiving is the same for me: food coma -> burn calories running around buying Christmas gifts on Black Friday -> Saturday morning regret, followed by a dash of black coffee. 😃

Anyway, where was I? Ah! Yes. So, recently I’ve been playing around with Databricks after a long hiatus from the tool. But when I went to use my Community version, I saw that I was incurring costs. A whopping $3.00! I know, it’s not much, but I don’t want to pay anything for it.

So, I did some research and found that most of what I need is available on Docker Hub.

I don’t want to take any credit that isn’t mine, so I’m going to link to the git repository where I got my data layer:
