Essential Linux for Cloud (Ser. 2)


What is Linux?

  • Linux is an open-source operating system that is based on the Unix operating system.

  • It is known for its stability, security, and flexibility, and is widely used in servers, supercomputers, and mobile devices.

  • Linux is also popular among developers and power users due to its command-line interface and the ability to customize and configure the system to their specific needs.

Linux Use Cases

There are many devices that run Linux, including:

a) Servers: Many web servers and cloud servers run on Linux.

b) Supercomputers: Some of the world's most powerful supercomputers run on Linux.

c) Desktops and Laptops: Many people use Linux as their primary operating system on their personal computers, as Linux distributions such as Ubuntu, Fedora, and Debian are popular alternatives to Windows and macOS.

d) Mobile devices: Some smartphones and tablets run on Linux-based operating systems, such as Android.

e) Network devices: Many routers, switches, and other networking equipment run on Linux.

f) Embedded systems: Linux is widely used in embedded systems such as smart TVs, digital signs, and industrial control systems.

g) IoT devices: Many Internet of Things (IoT) devices run on Linux, including smart home devices and industrial control systems.

h) Automotive: Some car models have Linux-based infotainment systems.


Linux Architecture

Overview of each component:

  1. Hardware

    • This refers to the physical components of a computer system, such as the CPU, memory, storage devices, and peripherals (e.g., keyboard, mouse, display).

    • Linux is designed to run on a wide range of hardware architectures, from small embedded devices to large servers.

  2. Kernel Modules

    • Kernel modules are pieces of code that can be loaded into the Linux kernel at runtime to extend its functionality or add support for new hardware devices.

    • They can be loaded and unloaded dynamically, without requiring a reboot. Kernel modules are used to add device drivers, file systems, network protocols, and other features to the kernel, allowing Linux to support a wide range of hardware and software configurations.

  3. Kernel

    • The Linux kernel is the core component of the operating system.

    • It is responsible for managing the system's resources, such as the CPU, memory, and devices.

    • The kernel provides the foundation for all other components of the operating system and interacts directly with the hardware.

  4. System Libraries

    • System libraries are collections of precompiled routines and functions that provide an interface between applications and the operating system.

    • They abstract low-level details and provide a standardized way for applications to access system resources and services.

    • Examples of system libraries in Linux include the GNU C Library (glibc), which provides functions for file I/O, memory management, and networking.

  5. System Call Interface Layer

    • The system call interface layer is part of the kernel that provides a way for user-space applications to request services from the kernel.

    • System calls are the mechanism by which user-space applications can interact with the kernel and access resources such as files, processes, and devices.

  6. User-Space Application Layer

    • This layer includes all the user-space applications and processes running on top of the kernel.

    • User-space applications communicate with the kernel through system calls and rely on system libraries to provide access to operating system services.


Linux Distributions

Linux distributions, also known as "distros," are different versions of the Linux operating system that are built and packaged with a specific set of software and features.

Some popular examples of Linux distributions include:

  • Ubuntu: Ubuntu is a popular and user-friendly distribution that is widely used on desktops and servers. It is known for its regular releases and long-term support (LTS) versions.

  • Fedora: Fedora is a community-driven distribution that is sponsored by Red Hat. It is known for its cutting-edge software and focus on open-source principles.

  • Debian: Debian is a stable and reliable distribution that is widely used on servers and in embedded systems. It is known for its large software repository and package management system.

  • Arch Linux: Arch Linux is a lightweight and flexible distribution that is popular among experienced Linux users. It is known for its rolling release model and minimalistic design.

  • Linux Mint: Linux Mint is a community-driven distribution that is based on Ubuntu. It is known for its ease of use and attractive interface.

  • Gentoo: Gentoo is a flexible and customizable distribution that is popular among experienced Linux users. It is known for its Portage package management system and ability to build packages from source.

  • Red Hat Enterprise Linux (RHEL): RHEL is a commercial distribution that is widely used in enterprise environments. It is known for its stability, security, and support for enterprise-level features.


Linux Folder Structure

This is a high-level overview; the actual structure may vary slightly depending on the Linux distribution.

Each directory serves a specific purpose in organizing and managing the files and data on a Linux system.

  1. / (Root Directory): The top-most level in the file system hierarchy. Everything starts from here.

  2. /bin (Binary Executables): Contains essential binary executables (i.e., compiled programs) that are necessary for the system to boot and run. Common commands like ls, cp, mv, rm are located here.

  3. /sbin (System Binaries): Contains binary executables that are essential for system administration, such as ifconfig and fdisk.

  4. /etc (Configuration Files): Contains configuration files for the system and applications. You'll find important system configuration files like /etc/fstab, /etc/passwd, and /etc/nginx/nginx.conf here.

  5. /usr (User Programs): Contains user-space programs, libraries, documentation, and other files not needed for booting. It's like the "Program Files" directory in Windows.

  6. /var (Variable Data): Contains variable data, such as log files (/var/log) and spool directories (/var/spool).

  7. /lib (Libraries): Contains libraries (collections of code) that are used by programs in /bin and /sbin.

  8. /home (Home Directories): Contains the home directories of users on the system. Each user has a subdirectory here with their name (e.g., /home/john).

  9. /tmp (Temporary Files): Contains temporary files that can be deleted by the system when space is needed. Users can also use this directory for temporary storage.

  10. /dev (Device Files): Contains device files that represent the various hardware devices on the system, such as hard drives (/dev/sda) and USB drives (/dev/sdb).

  11. /proc (Process Information): A virtual file system that contains information about the system's processes and the kernel. It's not an actual disk file system but a way to communicate with the kernel.

  12. /sys (System Information): A virtual file system that provides an interface to the kernel's device drivers and other kernel internals. It's similar to /proc but focuses on system configuration rather than process information.
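To see how this layout looks on a particular machine, a small loop can report which of the directories above exist. This is a minimal sketch: the list of paths is taken from the overview above, and the variable names (dirs, found) are only for illustration.

```shell
#!/bin/sh
# Check which of the standard top-level directories exist on this system.
dirs="/ /bin /sbin /etc /usr /var /lib /home /tmp /dev /proc /sys"
found=0

for d in $dirs; do
    if [ -d "$d" ]; then
        echo "present: $d"
        found=$((found + 1))
    else
        echo "missing: $d"
    fi
done

echo "$found of the listed directories exist here"
```

On many current distributions /bin, /sbin, and /lib are symbolic links into /usr, so they still show up as present.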


File Management

File management in Linux refers to the process of creating, organizing, and manipulating files and directories in the Linux file system. This includes tasks such as creating new files and directories, copying, moving, and deleting files, and managing permissions and ownership of files.

There are several basic commands that are commonly used for file management in Linux:

a) ls: Lists the files and directories in a directory.

ls -l : Provides a detailed listing that includes information such as file permissions, number of hard links, owner, group, size, and last modified date and time.

ls -la : Lists all files (including hidden files) in a directory along with detailed information about each file.

b) pwd: Prints the current working directory.

c) cd: Changes the current working directory.

cd ~ : Changes the current directory to the home directory of the current user.

cd / : Changes the current directory to the root directory.

d) mkdir: Creates a new directory.

e) touch: Creates an empty file or updates the access and modification times of an existing file.

f) Text editors

There are multiple text editors, such as nano, vim, and gedit. I have nano and gedit on my machine; you can install whichever editors you like.

I found gedit much easier to navigate and have written some text with it. Now, let's see what it contains.

g) cat: Displays the entire contents of a file.

h) mv: Renames files and directories. There is also a separate rename command, but mv is the usual choice.

mv <old name> <new name>

i) cd ..: Changes the current directory to the parent directory (the directory one level up in the directory hierarchy). Note the space between cd and the two dots.

j) mv: The same command renames a folder and also moves files and directories.
For example, mv new.txt SecondDir/ moves a file named new.txt into the SecondDir directory.

k) cp: Copies a file; use cp -r to copy directories.

Here, we start with only the yo.txt file and copy its contents to a new file, new.txt. We now have two files with the same contents.

Next, we make a directory called NewDir and copy the yo.txt file into it.

cp -r <One Directory> <New Directory/>

Finally, we create a new directory called SecondDir and copy the NewDir directory into it. Copying a directory this way copies all of its contents as well.
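The create, copy, rename, and move steps above can be replayed in one go. This is a sketch that works inside a temporary directory made with mktemp; the file and directory names follow the walkthrough, except renamed.txt, which is a made-up name for illustration.

```shell
#!/bin/sh
# Work inside a throwaway directory so nothing important is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

mkdir NewDir SecondDir          # create two directories
touch yo.txt                    # create an empty file
echo "hello" > yo.txt           # give it some content

cp yo.txt new.txt               # copy a file: two files, same contents
cp yo.txt NewDir/               # copy a file into a directory
cp -r NewDir SecondDir/         # copy a directory and everything in it

mv new.txt renamed.txt          # rename a file
mv renamed.txt SecondDir/       # move a file into a directory

ls -R .                         # list everything recursively
```

Because SecondDir already exists, cp -r places the copy at SecondDir/NewDir rather than overwriting SecondDir itself.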

l) Removing files and directories

We can use rm <file name> to remove files.

We can use rm -i <file name> to be asked for confirmation before each file is removed.

Now, let's look at directories.

Use the rmdir command to delete an empty directory, and use rm -r <directory_name> to delete a directory that has contents.

rm -rf * : Recursively deletes all files and subdirectories in the current directory without asking for confirmation.

(In Linux, "recursively" means that a command applies to a directory and all of its contents, including subdirectories and their contents, and so on.)

Always double-check your command and make sure you are in the correct directory before running such a command. Files removed with rm do not go to a trash folder.
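A safe way to practise these commands is inside a scratch directory. The sketch below assumes mktemp is available; the names sandbox, empty_dir, and full_dir are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Practise deletion inside a throwaway directory.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

mkdir empty_dir full_dir
touch a.txt full_dir/b.txt full_dir/c.txt

rm a.txt                 # remove a single file
rmdir empty_dir          # works only because the directory is empty

# rmdir refuses to delete a directory that still has contents.
if rmdir full_dir 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi
echo "rmdir on a non-empty directory: $rmdir_result"

rm -r full_dir           # removes the directory and everything inside it

remaining=$(ls -A | wc -l | tr -d ' ')
echo "entries left in the sandbox: $remaining"
```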

m) Exploring the vim text editor

We can create a new file and open it for editing by running vim <file name>.

Press i to enter insert mode.

Here you can add text. When you are done, press Esc to leave insert mode (Ctrl+C also works).

Type :wq and press Enter to save and exit the editor.

We can see the contents of the file with the cat command.

n) Exploring file and directory information

total 8 : The "total" line printed by ls -l is the total disk allocation of the listed entries, counted in blocks. It is not the sum of the file sizes in bytes, and it does not include the contents of subdirectories. Here, total 8 means that the entries in this directory occupy 8 blocks. With GNU ls, each block represents 1 kilobyte (1024 bytes) by default.

drwxrwxr-x  2 sujan sujan 4096 Apr 23 06:06 .

  • drwxrwxr-x: This indicates the file type and permissions of the entry.

    • d: Indicates that this is a directory.

    • rwx: The owner (sujan in this case) has read, write, and execute permissions.

    • rwx: The group (sujan group in this case) has read, write, and execute permissions.

    • r-x: Others (users who are not the owner and not in the group) have read and execute permissions, but no write permission.

  • 2: This indicates the number of hard links to this directory. In this case, there are 2 hard links.

    (A hard link is an additional name for the same underlying data on the disk. Hard links are not copies; they are more like aliases. On traditional file systems such as ext4, a directory has one link from its parent, one from its own . entry, and one more from the .. entry of each subdirectory.)

  • sujan: This is the owner of the directory.

  • sujan: This is the group associated with the directory.

  • 4096: This is the size of the directory in bytes. Since this is a directory, the size represents the amount of disk space the directory's metadata occupies.

  • Apr 23 06:06: This is the date and time when the directory was last modified.

  • .: This is the name of the directory. In Linux, . represents the current directory.

drwxr-xr-x 10 sujan sujan 4096 Apr 23 06:06 ..
  • drwxr-xr-x: This indicates the file type and permissions of the entry.

    • d: Indicates that this is a directory.

    • rwx: The owner (sujan in this case) has read, write, and execute permissions.

    • r-x: The group (sujan group in this case) has read and execute permissions, but no write permission.

    • r-x: Others (users who are not the owner and not in the group) have read and execute permissions, but no write permission.

  • 10: This indicates the number of hard links to this directory. In this case, there are 10 hard links.

  • sujan: This is the owner of the directory.

  • sujan: This is the group associated with the directory.

  • 4096: This is the size of the directory in bytes. Since this is a directory, the size represents the amount of disk space the directory's metadata occupies.

  • Apr 23 06:06: This is the date and time when the directory was last modified.

  • ..: This is the name of the directory. In Linux, .. represents the parent directory.

-rw-rw-r--  1 sujan sujan    0 Apr 23 06:06 hello.txt
  • -rw-rw-r--: This indicates the file type and permissions of the entry.

    • -: Indicates that this is a regular file.

    • rw-: The owner (sujan in this case) has read and write permissions, but no execute permission.

    • rw-: The group (sujan group in this case) has read and write permissions, but no execute permission.

    • r--: Others (users who are not the owner and not in the group) have read permission, but no write or execute permission.

  • 1: This indicates the number of hard links to this file. In this case, there is 1 hard link.

  • sujan: This is the owner of the file.

  • sujan: This is the group associated with the file.

  • 0: This is the size of the file in bytes. Since the size is 0, the file is empty.

  • Apr 23 06:06: This is the date and time when the file was last modified.

  • hello.txt: This is the name of the file.
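The permission string and the hard link count can be pulled out of the ls -l output directly. This sketch assumes a file system that supports hard links; hello.txt matches the listing above, while alias.txt and the variable names are made up for the example.

```shell
#!/bin/sh
# Create a file, then give it a second name with a hard link.
work=$(mktemp -d)
cd "$work" || exit 1

touch hello.txt
chmod 664 hello.txt                       # rw-rw-r--, as in the listing above

# Field 1 of `ls -l` is type+permissions, field 2 is the hard link count.
perms=$(ls -l hello.txt | awk '{print $1}' | cut -c1-10)
links_before=$(ls -l hello.txt | awk '{print $2}')

ln hello.txt alias.txt                    # second name for the same data
links_after=$(ls -l hello.txt | awk '{print $2}')

echo "permissions: $perms"
echo "hard links before: $links_before, after: $links_after"
```

Deleting either name afterwards leaves the data reachable through the other one; the data is freed only when the link count drops to zero.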


File System

File systems in Linux play a crucial role in managing how data is stored, organized, and accessed on storage devices such as hard drives, solid-state drives (SSDs), and USB drives. Each file system has its own features and characteristics that make it suitable for different use cases.

Brief overview of some commonly used file systems in Linux:

  1. ext4: The default file system for many Linux distributions, ext4 is known for its stability, performance, and large file support. It supports file systems of up to 1 exabyte and individual files of up to 16 terabytes.

  2. XFS: Designed for large storage systems, XFS is optimized for handling large files and high-bandwidth I/O operations. It supports file sizes up to 8 exabytes and file system sizes up to 18 exabytes.

  3. Btrfs: A newer file system that aims to be more scalable and efficient than traditional file systems. Btrfs supports features such as snapshotting, data integrity checks, and RAID-like functionality. It supports file and file system sizes of up to 16 exabytes.

  4. NTFS: While not native to Linux, NTFS is a Windows file system that can be read and written to by Linux systems using the NTFS-3G driver. It is commonly used for dual-booting systems or accessing data on Windows-formatted drives.

  5. FAT32: Another Windows file system that can be read and written to by Linux systems, FAT32 is often used for USB drives and other external storage devices due to its wide compatibility.

  6. ReiserFS: A journaling file system designed for high performance and efficient use of disk space, ReiserFS is commonly used on systems with limited disk space.

  7. ZFS: Known for its data integrity features such as checksumming, ZFS is designed for large storage systems. While not natively supported by Linux, it can be used through third-party implementations.

Different file systems offer different trade-offs in terms of performance, reliability, and features, so it's important to consider these factors when selecting a file system for a specific purpose.


Searching

Searching is an essential aspect of working with files and directories in Linux. It allows users to locate specific files, directories, or content within files.

Some common commands and techniques for searching in the Linux terminal:

  1. Searching:

    • find: Search for files and directories. Use find <directory> -name <filename> to search for files with a specific name in a directory and its subdirectories.

    • grep (global regular expression print): Search for patterns in files.

    • Use grep <pattern> <file> to search for a specific pattern in a file.

      For example, grep "hello" *.txt searches for the word "hello" in all .txt files in the current directory.

      This is just one use case for now; grep is a powerful command worth exploring further.

    • locate: Find files by name. Use locate <filename> to quickly find files by name. Note that locate uses a pre-built database, so it may not show the most recent files.

    • which : Used to locate the executable file associated with a given command.

  2. Wildcard Characters:

    • *: Matches zero or more characters. For example, ls *.txt lists all files with a .txt extension.

    • ?: Matches any single character. For example, ls file?.txt lists files like file1.txt, file2.txt, etc.

  3. History:

    • Use the history command to view a list of recently executed commands.

    • You can also use ! followed by the command number to rerun a command from history. For example, !123 reruns command number 123.
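The search commands and wildcards above can be tried on a small test tree. In this sketch the directory layout and file names (docs, old, file1.txt, and so on) are invented for the example; locate is left out because it depends on a pre-built database that may not exist on every system.

```shell
#!/bin/sh
# Build a small tree to search through.
root=$(mktemp -d)
mkdir -p "$root/docs/old"
echo "hello world"   > "$root/docs/file1.txt"
echo "nothing here"  > "$root/docs/file2.txt"
echo "hello again"   > "$root/docs/old/notes.txt"
echo "hello"         > "$root/docs/readme.md"

# find: every .txt file under docs, including subdirectories.
txt_count=$(find "$root/docs" -name '*.txt' | wc -l | tr -d ' ')

# grep -l: names of the .txt files in docs that contain "hello".
# The shell expands *.txt before grep runs; it does not descend into old/.
hello_files=$(cd "$root/docs" && grep -l "hello" *.txt)

# ?: exactly one character, so file?.txt matches file1.txt and file2.txt.
single_char_count=$(cd "$root/docs" && ls file?.txt | wc -l | tr -d ' ')

echo "txt files found by find: $txt_count"
echo "files containing hello:  $hello_files"
echo "matches for file?.txt:   $single_char_count"
```

Quoting the pattern passed to find ('*.txt') keeps the shell from expanding it first, so find receives the wildcard itself and applies it in every subdirectory.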


File Permissions Management

In Linux, file permission management refers to the process of controlling access to files and directories by setting permissions and ownership. This is an important aspect of Linux security and system administration.

Each file and directory in the Linux file system has permissions and ownership associated with it. The permissions determine who can read, write, and execute the file or directory, while the ownership determines which user and group the file or directory belongs to.

Permissions in Linux are represented by a series of letters and symbols, such as rwxrwxrwx, which indicate the permissions for the owner, the group, and other users.

The letters 'r', 'w', and 'x' stand for read, write, and execute permissions, respectively.

  1. chmod: This command changes the permissions of a file or directory. It can add or remove permissions for the owner, group, or other users.

    The syntax is chmod [options] mode file.

    • Example: chmod 755 example.txt gives the owner read, write, and execute permissions, and gives the group and others read and execute permissions.

Let's get our hands dirty.

In symbolic mode, use - to remove a permission and + to add one.

For example, chmod g-r <file name> removes read permission for the group.

Likewise, chmod g-rw <file name> removes read and write permissions together.
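The effect of each chmod call is easiest to see by printing the permission string after every change. This is a sketch on a scratch file; first.txt is the file name used in this section, while the show helper and the variable names are hypothetical.

```shell
#!/bin/sh
# Try symbolic and octal modes on a scratch file.
scratch=$(mktemp -d)
file="$scratch/first.txt"
touch "$file"

show() { ls -l "$file" | cut -c1-10; }    # print just the permission string

chmod 644 "$file";  start=$(show)         # rw-r--r--
chmod g-r "$file";  no_group_read=$(show) # remove read from the group
chmod u+x "$file";  owner_exec=$(show)    # add execute for the owner
chmod 755 "$file";  octal=$(show)         # rwxr-xr-x in one step

echo "$start -> $no_group_read -> $owner_exec -> $octal"
```

Symbolic modes change one permission at a time and leave the rest alone, while an octal mode such as 755 replaces the whole set at once.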

  2. chown: Used to change the ownership of a file or directory. It can change both the owner and group.

    The syntax is chown [options] new_owner:new_group file.

    • Example: chown user2:user2 example.txt changes the owner to user2 and the group to user2.

    • The [options] in the chown command syntax is a placeholder for any additional options or flags that you can use with the command. Options modify the behavior of the command.

  3. ls -l: This command lists files and directories in a directory with their permissions and ownership information. It's used to view the current permissions and ownership of files and directories.

    Let's try further. Here, we first change the group permissions of the first.txt file by removing read and write.

    Then we remove the permissions on first.txt for the owner, the group, and others as well. ;)

  4. umask (user file creation mask): In most Unix-like systems, programs request mode 666 when creating files and 777 when creating directories. The umask removes permission bits from that request.

    The umask command is useful for maintaining security and privacy on a Linux system. By setting a umask value, you can control the default permissions of files and directories created by you or other users on the system.

    For example, let's say you want new files to be readable and writable by the owner but only readable by the group and others, and new directories to be fully accessible to the owner but not writable by anyone else.

    You can achieve this by setting a umask value of 022.

    1. Check the current umask value by simply typing umask in the terminal. This will show you the current umask value, which is usually in octal format (e.g., 022).

    2. Set the umask value to 022 by typing umask 022 in the terminal.

    3. Verify the new umask value by typing umask again. It should now show 022.

Now, when you create a new file, it will have default permissions of 644 (666 with the 022 bits removed), and a new directory will have 755 (777 with the 022 bits removed), which means:

  • The owner can read and write a new file; the group and others can only read it.

  • The owner can read, write, and enter a new directory; the group and others can only list and enter it.

Strictly speaking, the mask is applied bit by bit (mode AND NOT umask) rather than by subtraction, although for 022 both give the same answer.

This helps ensure that new files you create have the desired permissions, making your system more secure.
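The effect of a 022 umask can be checked both by creating real files and by doing the bit arithmetic. This is a sketch: it runs the umask change in a subshell so the calling shell is not affected, and the names newfile and newdir are made up for the example.

```shell
#!/bin/sh
# Run in a subshell so the umask change does not leak into the caller.
result=$(
    umask 022
    dir=$(mktemp -d)
    touch "$dir/newfile"
    mkdir "$dir/newdir"
    file_mode=$(ls -ld "$dir/newfile" | cut -c1-10)
    dir_mode=$(ls -ld "$dir/newdir"  | cut -c1-10)
    echo "$file_mode $dir_mode"
)
echo "file and directory modes: $result"

# The same result as arithmetic: base mode AND NOT mask, printed in octal.
file_calc=$(printf '%o' $(( 0666 & ~0022 )))
dir_calc=$(printf '%o' $(( 0777 & ~0022 )))
echo "files: $file_calc, directories: $dir_calc"
```

A umask set in a terminal lasts only for that shell session; making it permanent usually means adding the umask line to a shell startup file.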


Process management

Process management in Linux involves controlling and monitoring the processes running on a system. Here are some key aspects of process management:

1) Viewing Processes:

  • ps: Used to list the currently running processes. Its output includes these columns:

    a) PID: The process ID.

    b) TTY: The terminal (teletypewriter) the process is attached to. Values such as pts/0 are pseudo-terminals, which are used by terminal emulators such as the terminal window you use to interact with the system. The number after pts/ indicates which pseudo-terminal it is (e.g., pts/0 for the first pseudo-terminal, pts/1 for the second, and so on).

    c) ?: Shown in the TTY column if a process is not associated with a terminal, such as a background service.

  • top: Interactive command that provides a dynamic real-time view of processes. It displays a list of processes that are currently running, along with detailed information such as CPU usage, memory usage, process IDs (PIDs), and more.

  • htop: Interactive process viewer similar to top but with more features and a more user-friendly interface.

2) Managing Processes:

  • kill: Sends a signal to a process, identified by its process ID (PID). The default signal, SIGTERM, asks the process to terminate.

  • killall: Used for terminating multiple processes based on their name. It allows you to specify the name of a process (or a pattern) and sends a signal to all processes matching that name to terminate them

  • pkill: Similar to killall, but with more flexible pattern matching for process names.

3) Background Processes:

  • &: Allows you to start a process in the background.

  • bg: Resumes a stopped (suspended) job and keeps it running in the background.

  • fg: Brings a background or stopped job to the foreground.

4) Process Prioritization:

  • nice: Runs a command with a modified scheduling priority.

  • renice: Changes the priority of a running process.

5) Process States:

  • Processes can be in various states such as running, sleeping, stopped, zombie, etc. These states indicate the current status of a process.

6) Controlling Process Execution:

  • nohup is a command in Unix and Unix-like operating systems that is used to run a command immune to hangups, with output to a non-tty (non-terminal) device.

# Using ./ before the name of a script or executable tells the system to look for the script or executable in the current directory.
nohup ./my_script.sh &
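The background and kill steps can be combined into a short script. This is a sketch that uses sleep as a stand-in for a long-running job; the variable names are hypothetical, and ps is skipped if it is not installed.

```shell
#!/bin/sh
# Start a long-running command in the background and remember its PID.
sleep 300 &
pid=$!
echo "started sleep with PID $pid"

# ps -p shows just that process, if ps is available on this system.
command -v ps >/dev/null 2>&1 && ps -p "$pid"

# kill -0 tests whether the process exists without sending a real signal.
if kill -0 "$pid" 2>/dev/null; then alive_before="yes"; else alive_before="no"; fi

# kill sends SIGTERM by default, asking the process to exit.
kill "$pid"
wait "$pid" 2>/dev/null || true   # reap the process so no zombie is left

if kill -0 "$pid" 2>/dev/null; then alive_after="yes"; else alive_after="no"; fi
echo "alive before kill: $alive_before, after: $alive_after"
```

$! holds the PID of the most recent background command, which avoids searching the ps output for it.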

Networking

Networking in Linux involves configuring and managing network interfaces, connecting to networks, and troubleshooting network issues.

Common Linux networking commands:

  1. ifconfig: Displays the configuration of network interfaces.

    Example: ifconfig eth0

    However, it is deprecated in many Linux distributions in favor of ip.

    ip link show : Lists all network interfaces and their names.

  2. ip addr show: Displays the configuration of network interfaces using the ip command, which is the modern replacement for ifconfig.

    Example: ip addr show eth0

  3. ping: Checks the connectivity to a network host by sending ICMP echo request packets.

    Example: ping example.com

  4. traceroute: Shows the route that packets take to reach a destination host, displaying each hop along the way.

    Example: traceroute example.com

  5. nslookup: Queries DNS servers to obtain domain name or IP address mapping information.

    Example: nslookup example.com

  6. netstat: Displays network connections, routing tables, interface statistics, etc.

    Example: netstat -tu (to show active TCP and UDP connections)

  7. route: Manages the IP routing table, including viewing and adding routes.

    Example: route add -net 192.168.1.0 netmask 255.255.255.0 gw 192.168.0.1 (adds a route to the 192.168.1.0/24 network via gateway 192.168.0.1)

  8. iptables: Manages firewall rules on a Linux system.

    Example: iptables -A INPUT -p tcp --dport 80 -j ACCEPT (adds a rule to allow incoming TCP traffic on port 80)
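Because ifconfig is missing on many newer systems, a script that inspects interfaces may want to check which tool is present first. This is a minimal sketch, assuming only that command -v is available in the shell; net_tool is a hypothetical variable name.

```shell
#!/bin/sh
# Prefer the modern `ip` tool and fall back to `ifconfig` if it is missing.
if command -v ip >/dev/null 2>&1; then
    net_tool="ip"
    ip addr show || echo "could not list interfaces with ip"
elif command -v ifconfig >/dev/null 2>&1; then
    net_tool="ifconfig"
    ifconfig || echo "could not list interfaces with ifconfig"
else
    net_tool="none"
    echo "neither ip nor ifconfig is installed"
fi
echo "interface tool used: $net_tool"
```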


System Administration

System administration in Linux involves maintaining and managing the various components of a Linux system, including hardware, software, and network resources.

Some Linux system administration commands:

  1. df: View file system disk space usage.

    df -h : To view the usage in human-readable format

  2. du: Estimate file space usage.

  3. free: Display amount of free and used memory in the system.

  4. uptime: Show how long the system has been running.

  5. uname: Print system information.

  6. systemctl: Control the systemd system and service manager.

  7. yum or apt-get: Install, update, or remove software packages.

  8. useradd and userdel: Add or delete a user account.
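Several of these commands can be combined into a quick system snapshot. This is a sketch that needs no root privileges; free and uptime are checked first because they may be absent on minimal installations, and the variable names are hypothetical.

```shell
#!/bin/sh
# A quick health snapshot built from the commands above.
kernel=$(uname -s)            # kernel name
release=$(uname -r)           # kernel release string
echo "kernel: $kernel $release"

echo "disk usage of the root file system:"
df -h /

echo "size of the current directory:"
du -sh . 2>/dev/null || true  # ignore unreadable subdirectories

# free and uptime may be missing on minimal systems, so check first.
for tool in free uptime; do
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool"
    else
        echo "$tool is not installed"
    fi
done
```

Commands that change the system, such as systemctl, apt-get, yum, useradd, and userdel, are left out because they normally need root privileges.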