Hackers and Slackers

Data Engineering

Collect and transform data on a large scale. Build data pipelines, work with a horizontally scalable architecture, or simply scrape and collect data.
Create Google BigQuery Tables via the Python SDK

Use Google Cloud's Python SDK to insert large datasets into Google BigQuery, enjoy the benefits of schema detection, and manipulate data programmatically.
Todd Birchard
Feb 8, 2021 • 11 min read
Google Cloud
Scrape Structured Data with Python and Extruct

Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library.
Todd Birchard
Jun 29, 2020 • 13 min read
Python
Simplify BigQuery ETL jobs using SQLAlchemy

Extract and move data between BigQuery and relational databases using PyBigQuery: a connector for SQLAlchemy.
Todd Birchard
Nov 16, 2019 • 8 min read
Data Warehouses
Using Amazon Redshift as your Data Warehouse

Get the most out of Redshift by performance tuning your cluster and learning how to query your data optimally.
Todd Birchard
Jul 29, 2019 • 12 min read
Data Warehouses
Join and Aggregate PySpark DataFrames

Perform SQL-like joins and aggregations on your PySpark DataFrames.
Todd Birchard
Jun 24, 2019 • 7 min read
Spark
Working with PySpark RDDs

Working with Spark's original data structure API: Resilient Distributed Datasets.
Todd Birchard
Jun 6, 2019 • 8 min read
Spark
Manage Data Pipelines with Apache Airflow

Use Apache Airflow to build and monitor better data pipelines.
Todd Birchard
Jun 3, 2019 • 13 min read
Apache
Structured Streaming in PySpark

Become familiar with building a structured stream in PySpark using the Databricks interface.
Todd Birchard
May 13, 2019 • 8 min read
Spark
Becoming Familiar with Apache Kafka and Message Queues

Getting to know Apache Kafka: a horizontally scalable event streaming platform. Learn what makes Kafka critical to high-volume, low-latency data pipelines.
Todd Birchard
May 4, 2019 • 6 min read
Apache
Cleaning PySpark DataFrames

Easy DataFrame cleaning techniques ranging from dropping rows to selecting important data.
Todd Birchard
Apr 27, 2019 • 18 min read
Spark
Transforming PySpark DataFrames

Apply transformations to PySpark DataFrames such as creating new columns, filtering rows, or modifying string & number values.
Todd Birchard
Apr 26, 2019 • 15 min read
Spark
Learning Apache Spark with PySpark & Databricks

Get started with Apache Spark in part 1 of our series, where we leverage Databricks and PySpark.
Todd Birchard
Apr 25, 2019 • 13 min read
Spark
Building an ETL Pipeline: From JIRA's REST API to SQL

Build a pipeline that extracts raw data from JIRA's Cloud API, transforms it, and loads it into a SQL database.
Todd Birchard
Mar 28, 2019 • 11 min read
Data Engineering
Working With GraphQL Fragments and Mutations

Make your GraphQL queries more dynamic with Fragments, plus get started with Mutations.
Todd Birchard
Mar 19, 2019 • 5 min read
GraphQL
Building a Client For Your GraphQL API

Now that we have an understanding of GraphQL queries and API setup, it's time to get that data.
Todd Birchard
Mar 9, 2019 • 6 min read
GraphQL
Writing Your First GraphQL Query

Begin to structure complex queries against your GraphQL API.
Todd Birchard
Mar 7, 2019 • 8 min read
GraphQL
Welcome to SQL: Modifying Databases and Tables

Brush up on SQL fundamentals such as creating tables, schemas, and views.
Todd Birchard
Feb 19, 2019 • 10 min read
SQL
Downcast Numerical Data Types with Pandas

Using an Example Where We Downcast Numerical Columns.
Matthew Alhonte
Jan 28, 2019 • 5 min read
Code Snippet Corner
From CSVs to Tables: Infer Data Types From Raw Spreadsheets

The quest to never explicitly set a table schema ever again.
Todd Birchard
Jan 23, 2019 • 9 min read
Big Data
Psycopg2: PostgreSQL & Python the Old Fashioned Way

Connect to a PostgreSQL database and execute queries in Python using the Psycopg2 library.
Todd Birchard
Jan 15, 2019 • 8 min read
PostgreSQL
Databases in Python Made Easy with SQLAlchemy

Leverage the iconic SQLAlchemy Python library to effortlessly handle database connections and queries in software.
Todd Birchard
Jan 9, 2019 • 10 min read
SQLAlchemy
MongoDB Stitch Serverless Functions

A crash course in MongoDB Stitch serverless functions: the bread and butter of MongoDB Cloud.
Todd Birchard
Nov 26, 2018 • 6 min read
NoSQL
Scraping Data on the Web with BeautifulSoup

Use Python's BeautifulSoup library to assist in the honest act of systematically stealing data without permission.
Todd Birchard
Nov 11, 2018 • 13 min read
Python
Create a REST API Endpoint Using AWS Lambda

Create an AWS Lambda function to pull records from a database.
Todd Birchard
Oct 29, 2018 • 4 min read
AWS
Working With Google Cloud Functions

GCP scores a victory by trivializing serverless functions.
Todd Birchard
Oct 18, 2018 • 6 min read
Google Cloud

Tags

Python Software DevOps Architecture Data Engineering Pandas Excel Data Analysis Data Science REST APIs SQL Code Snippet Corner JavaScript Flask AWS NodeJS Google Cloud Apache Frontend MySQL Data Vis NoSQL BI ExpressJS Spark PostgreSQL GraphQL ETL Pipelines Tableau GatsbyJS Atlassian SQLAlchemy Automation Machine Learning Big Data Golang Mapbox Scraping JAMStack Data Warehouses Powerpivot PowerBI Plotly Concurrency Docker SaaS Products Django Hashicorp ReactJS Frameworks FastAPI Terraform Java

Series

Data Analysis with Pandas 11
Build Flask Apps 11
Google Cloud Architecture 6
Learning Apache Spark 6
Mastering SQLAlchemy 4
GraphQL Tutorials 4
Welcome to SQL 4
Working with MySQL 4
Mapping Data with Mapbox 3
Python Concurrency with Asyncio 2
Web Scraping With Python 2
Getting Started with Django 2
Hackers and Slackers

Community of hackers obsessed with data science, data engineering, and analysis. Openly pushing a pro-robot agenda.

Series

  • Build Flask Apps
  • Data Analysis with Pandas
  • Learning Apache Spark
  • Google Cloud Architecture
  • Create a REST API in AWS
  • GraphQL Tutorials
  • Mastering SQLAlchemy
  • Hacking Tableau Server
  • Working with MySQL
  • Welcome to SQL
  • Mapping Data with Mapbox
  • Python Concurrency with Asyncio
  • Web Scraping With Python
  • Getting Started with Django

Authors

  • Todd Birchard
  • Matthew Alhonte
  • Max Mileaf
  • Ryan Rosado
  • Graham Beckley
  • David Aquino
  • Paul Armstrong
  • Dylan Castillo
2022 Hackers and Slackers