Why You Must Learn SQL in Data-Driven Era?

Ari Sulistiyo Prabowo
8 min readFeb 18, 2021

The company is looking for people who have knowledge of data to drive the company

Image by William Iven from Unsplash

The Goal of this Article:

  1. Understanding structured and unstructured data
  2. Fundamental SQL (Structured Query Language)
  3. Setting up your local database with DBeaver and PostgreSQL
  4. Mini practice to start querying

The Growth of Data in the Industry

In the era of technology and big data, companies are so massive to find people with the knowledge of data to drive the company to achieve its goals. Moreover, the data is increasing every time and change into big data. So, what is big data?

Big data is a high-volume, high-velocity, and high-variety (what we call 3V) information assets that enable enhanced insight, decision making, and process automation [Gartner]

The massive of data makes the companies want to detract valuable information in order to catch the opportunity in the market. Because data can impart everything whether the particular team grows slowly or fastly. Data itself are so wide to dive deep such as data in marketing, tracking users behavior, procurement, spending of ads, and many more. Therefore, the companies want to drive themselves based on data-driven.

Types of Data

Actually, there are types of data that you have to know in a real experience such as structured data and unstructured data. Structured data is a standardized format that is usually in a table with columns and rows. Here is structured data

Structured Data

On the other hand, unstructured data is not arranged to a pre-set model or schema therefore it can not be stored in a table like structured data. Mostly, the unstructured data contains multimedia and text data. Here is unstructured data

Unstructured Data

All of these data (structured and unstructured data) are stored in the database. In the database, people can pull the data, analyze the data, and communicate the data to stakeholders to detract from the insights of the data. Most people say for the structured database as SQL (Structured Query Language) and unstructured database as NoSQL. In this article, I will talk only about SQL. Here I also provide the difference between SQL and NoSQL to make it clear.

Fundamental of SQL

Before we go further into practice, let’s talk about the fundamental of SQL. In SQL, the data storage model contains a table with fixed rows and columns. So, in order to get the data, you just call certain columns and certain values based on the request from users. Here is the terminology of SQL.

SQL Terminology — IYKRA

There are three components in SQL you must know:
1. Data Definition Language (DDL)
2. Data Manipulation Language (DML)
3. Data Control Language (DCL)

Data Definition Language (DDL)

In the context of SQL, data definition language consists of the SQL commands to create and design the database schema. It just simply define the description of the schema, create and modify the structure of the database. Here are some commands in DDL:
- CREATE → to create a table in the database
- ALTER → to add or remove columns in the database
- DROP → to remove the table from the database

Data Manipulation Language (DML)

Data manipulation language is used to manipulate the data in the database. Here are some commands in DML:
- SELECT → to extract data from the database
- INSERT → to insert new data from the database
- UPDATE → to update the existing data within the table
- DELETE → to deleted data from the database

Here some commands of DML that usually use in SQL

DML syntax — IYKRA

Data Control Language (DCL)

Data control language mainly deals with the rights, permission, and other controls of the database system that includes commands such as GRANT and REVOKE.
- GRANT → to give user’s access privileges to the database
- REVOKE → to withdraw the user’s access from the database

How to Start Learning SQL

I believe the theory of SQL is enough for you, so let’s set up the environment and get your hands dirty.

Preparing the tools

As a beginner, you just prepare two tools in your local computer to set up the database environment such as Dbeaver and PostgreSQL.

DBeaver is an open-source universal database tool for developers and database administrators. You can directly download the DBever here and install it on your local computer.

PostgreSQL is an open-source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads.

Setting PostgreSQL

During the installation of PostgreSQL, you have to remember the password to connect the database, just keep the default Port (usually 5432) and default locale. Once the installation is finished, here you should do.
1. Open the PostgreSQL and you fill in your password
2. Create a new database
3. Name your database, for example *cars* and save
4. New database has been created

2. Create a new database
3. Name your database

Once you save the database, now you have a new database called *cars*

A new database called cars

Setting DBeaver

Once you finish installing the DBeaver, you will see the interface of DBeaver like this

The first interface of DBeaver

Before querying data inside DBeaver or connect the data into PostgreSQL, you must follow these steps:
1. Connect the new database and choose PostgreSQL
2. Now you will see the connection settings and setup server.
3. Create a new SQL script and download the cars dataset here and create the table
4. Import the dataset and finish

The first interface of DBeaver

Connect your DBeaver into PostgreSQL by clicking the red square that I mark and choose PostgreSQL and click the next button.

1. Connect DBeaver to PostgreSQL

In the second step, you first set up the Database into the name of the database in your PostgreSQL that is made before which is *cars*, fill the password same as your password in PostgreSQL and click finish.

2. Set up the connection settings

Well done, now you have a database in your DBeaver platform. Next, you need to download the dataset here and create the table in your database.

3. Create a new SQL script

Now you will use commands to create a table from DDL. Next, you open the CSV file to see the column names and write them based on their data types such as integer, varchar, float, and boolean. You may learn the definition of each data type here. After writing the script, you must click the play button that I mark with a red square to run the script.

3. Create a table

The table has been set up and the following step is you must import the dataset into the table. From the car table inside the cars database, right-click and choose import data.

4. Import the data

During the steps to import your data, you will be faced with the below settings. You see the column delimiter with, (comma) and quote char with (punctuation). Commonly, the CSV file with a comma and single punctuation, but it is better to see the CSV file first.

4. Importer Settings

Mini Practice from cars data

Congratulations, you now finish all the steps and start querying data from your local database using PostgreSQL. After this, you open pgAdmin → fill your password → go to cars database → go to Tables → right click and choose query tool and the last is to write the commands in the toolbox SELECT * FROM car

The interface of your output

Setting the database environment is nothing if you don’t practice directly with your local real experience in PostgreSQL.

Practice makes you perfect

Now, I want you to answer these questions and try to match my answers. I hope you can answer all that questions. Happy querying :)
1. How many data in cars dataset → 2499
2. How many brands of cars → 28
3. What is the maximum price → 84900
4. What brands exist between 1995 and 2005 → bmw, cadilac, chevrolet, chrysler, jeep, ford.

Bonus Information

SQL Join

SQL Join — Javarevisit

Restore *tar* data in your PostgreSQL

The tar data contains several tables that are useful for you to get your hands dirty with SQL Join. You may download the data with tar extension here. The steps you have to do:
1. Create a new database in PostgreSQL and set the name dvd_rental (you can set by your own)
2. Right-click in your new database and click restore
3. In the filename box → click the three dots → choose the format all files → choose dvdrental.tar → click select
4. Go to the dvd_rental database → choose schema → right click on the table → choose query tool.

Voila, you can query and do SQL join to like this

SQL Join

Conclusion

Once again I would like to say congratulations to you that you not only set up your own database from scratch but you also learn the fundamental of SQL. I am not saying that SQL is easy, you will find some aggregations and other manipulation methods in order to fulfill the demand of your users in the real project. After this, I want you to explore more by yourself and never stop learning. That’s all from me, if you want to connect with me, you can find me through LinkedIn and if you think this article is useful for everyone please share. Thank you all :)

--

--