Week 6 — Learning Basic Concepts of Cybersecurity
Published 2025-07-28 · infosecwriteups.com

Aang

Hi there! If you’re wondering who I am, I go by @iamaangx028 on the internet — you can call me Aang :)

I am a student who is trying to get into the cybersecurity field. As a part of that journey, I would like to share my progress with all of you through weekly blogs.

Just a small note for continuity…

So far, we have covered many important concepts related to the network, browser, and HTTP evolution in our previous blogs. Check out those previous blogs here for better continuity. In this week’s blog, we will be understanding some of the important concepts of System Design. Let’s go!!


Let us start Week 6!

Before we dive into the Web application architecture, we need to understand the following concepts. We may not need to have in-depth knowledge of every component, but still, it is required to know the basic components of System Design and how they work.


Top most important concepts in System Design

First, we need to understand the most important concepts of System design, and then how these are used to build modern web applications. Once we have a solid understanding of these concepts, we can then understand HTTP headers and further related topics.

I think we already have a solid understanding of some concepts like client-server architecture, IP addresses, DNS, and HTTP/HTTPS, so we need not cover them again here. Also, we will only cover the basics here, not in-depth knowledge. Please feel free to research any specific concept on your own. Let us start with the remaining concepts.

Proxy / Reverse Proxy

A proxy is a server that acts as a middleman. Whenever you send a request, it first goes to the proxy, and the proxy then forwards it to the server. A proxy hides the original requester's IP address by presenting its own IP address to the receiver.

A reverse proxy, on the other hand, acts as a proxy for the server. When a client sends a request to the server, it first reaches the reverse proxy, which then forwards the request to the server based on pre-defined rules. So there are two different types of proxies: forward proxies and reverse proxies.

Forward Proxy:


Credits to PowerCert

So a forward proxy is there to protect the client. A forward proxy can cache the pages that clients visit most often. It can be used for anonymity, since it hides the requester's IP. It can log all of the requests made by clients, so if anything happens, engineers can analyze the logs. It can also be used to bypass network restrictions. Organizations often deploy forward proxies and require employees to connect through them.

Reverse Proxy:


Credits to PowerCert

So, a reverse proxy is there to protect the server. A reverse proxy can also be used as a load balancer. It can serve static files to clients without needing to ask the backend servers. It can terminate SSL/TLS, inspect the request, and understand what the client is asking for.

Latency

Latency is the amount of delay between the request and the response. Say, for example, you send a request to a server in another country. The request has to travel to the server and come back to you with a response, and this round trip takes some time. That time is what we call latency.
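To make the idea concrete, here is a tiny Python sketch that measures a round trip. It is only an illustration: the `time.sleep` call stands in for a real network request, so the function and its timing target are assumptions, not a real client.

```python
import time

def measure_latency(request_fn):
    """Return the round-trip time (in seconds) of a single request."""
    start = time.perf_counter()
    request_fn()                      # send the request and wait for the response
    return time.perf_counter() - start

# Simulate a slow round trip with a 50 ms sleep instead of a real network call.
latency = measure_latency(lambda: time.sleep(0.05))
print(f"round-trip latency: {latency * 1000:.1f} ms")
```

With a real HTTP client in place of the lambda, the same wrapper would report how long the server and the network path took together.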

APIs

Nowadays, web applications perform increasingly complex operations. Requests and responses should be in a structured format so the client and server can process them efficiently. That's where APIs come in. API stands for Application Programming Interface. An API acts as a middle layer between the client and the server: both sides can use it without needing to worry about the other's underlying implementation.

REST APIs

Two common types of APIs are REST and SOAP. The main difference between them is that REST is stateless, which means each request is independent of the others. Because of that, in a REST API, we need to provide the session cookie or the Authorization header in every request to prove our identity. REST APIs are the most widely used. REST supports HTTP methods like GET, POST, DELETE, and PUT, and web applications use these methods to perform different operations.
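To see the statelessness in action, here is a toy, in-memory sketch of a REST-style dispatcher (not a real HTTP server; the paths, the token value, and the handler name are made up for illustration). Because the server keeps no session, every request must carry its own Authorization header:

```python
# A toy, in-memory sketch of a stateless REST-style dispatcher.
users = {1: {"name": "Aang"}}

def handle(method, path, headers, body=None):
    # Stateless: credentials must arrive with every single request.
    if headers.get("Authorization") != "Bearer secret-token":
        return 401, {"error": "unauthorized"}
    if method == "GET" and path.startswith("/users/"):
        uid = int(path.rsplit("/", 1)[1])
        return (200, users[uid]) if uid in users else (404, {})
    if method == "POST" and path == "/users":
        uid = max(users) + 1
        users[uid] = body
        return 201, {"id": uid}
    return 405, {}

status, data = handle("GET", "/users/1", {"Authorization": "Bearer secret-token"})
print(status, data)   # 200 {'name': 'Aang'}
```

The same request without the header gets a 401, which is exactly why clients resend the cookie or token on every call.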

GraphQL

GraphQL is another type of API, developed by the Facebook team to overcome some limitations of REST APIs. With a REST API, the response JSON sometimes contains more data than necessary, and each request can perform only one operation at a time. GraphQL overcomes these limitations by letting the client specify exactly which fields should appear in the response, and by allowing multiple operations in a single call.
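The field-selection idea can be mimicked in a few lines of Python. This is not real GraphQL (no query language, no schema); the record and resolver name are invented purely to show how the client, not the server, decides which fields come back:

```python
# A toy resolver that mimics GraphQL's field selection: the client names
# the fields it wants, and the response contains only those fields.
user_record = {"id": 7, "name": "Aang", "email": "a@example.com", "bio": "..."}

def resolve_user(requested_fields):
    return {f: user_record[f] for f in requested_fields if f in user_record}

# A REST-style endpoint would return the whole record; here we ask for two fields.
print(resolve_user(["id", "name"]))   # {'id': 7, 'name': 'Aang'}
```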

Databases

Databases are used to store information. We cannot keep large amounts of data in memory, so to store users' data we use separate database servers. Usually, all of the user data is stored in these databases, and the application server reads and writes data to the database.

SQL vs NoSQL

There are typically two types of Database servers, namely SQL and NoSQL. SQL databases are the traditional databases that store data in table formats. SQL databases have a predefined Schema. Whereas NoSQL databases do not have a fixed schema and are highly scalable and flexible. The data can be stored in any format you like, say, Key-value pairs or graphs.
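The schema difference is easy to demonstrate with Python's built-in `sqlite3` module standing in for a SQL database, and a plain list of dicts standing in for a schemaless document store (the table and field names are made up for the example):

```python
import sqlite3

# SQL side: a predefined schema; every row must fit the table's columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Aang')")
row = conn.execute("SELECT name FROM users WHERE id = 1").fetchone()
print(row[0])  # Aang

# NoSQL side (simulated with dicts): no fixed schema, so each "document"
# can carry different fields.
documents = [
    {"id": 1, "name": "Aang"},
    {"id": 2, "name": "Katara", "hobby": "waterbending"},  # extra field is fine
]
print(documents[1]["hobby"])  # waterbending
```

Adding a new column to the SQL table would require an `ALTER TABLE`; adding a new field to a document requires nothing at all, which is the flexibility NoSQL trades schema guarantees for.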

Vertical Scaling

Whenever the traffic coming to your website increases, the load on the backend server increases. At some point, the server cannot handle the requests anymore. Then we may need to upgrade the server by increasing its CPU, storage, and memory. Like that, we can handle the incoming requests by upgrading the existing server. But it is not always possible to upgrade the existing server, and it can introduce other risks, such as a single point of failure. That's where horizontal scaling comes into play.

Horizontal scaling

Horizontal scaling is where you do not upgrade the existing server's specs, but instead add more servers with the same or smaller specs. So you may have two medium-scale servers instead of one large server. Even if one server goes down due to some problem, the other can pick up the load immediately.

Load balancer

So, say for example, you have five medium-scale servers and there is huge incoming traffic to your website. Then we may choose to put a load balancer in front of the servers. A load balancer handles the large volume of incoming requests by intelligently spreading them across all five servers. Say one of the five servers is struggling to process requests; the load balancer then forwards some of that load to the other servers. And I think that's how all of these large e-commerce sites balance huge incoming traffic during sales/festival seasons.
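The simplest distribution strategy a load balancer can use is round robin: hand each new request to the next server in the list, wrapping around at the end. Here is a minimal sketch of that idea (the server names and `route` method are invented for illustration; real balancers also track server health):

```python
from itertools import cycle

class LoadBalancer:
    """Round-robin load balancer: requests are spread evenly across servers."""
    def __init__(self, servers):
        self._servers = cycle(servers)   # endlessly repeats the server list
    def route(self, request):
        server = next(self._servers)
        return f"{server} handled {request}"

lb = LoadBalancer(["server-1", "server-2", "server-3"])
for i in range(5):
    print(lb.route(f"request-{i}"))
# request-0 -> server-1, request-1 -> server-2, request-2 -> server-3,
# then it wraps: request-3 -> server-1, request-4 -> server-2
```

Real load balancers offer other algorithms too, such as least-connections or weighted round robin, for exactly the "one server is struggling" case described above.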


Credits to CodeLit


Credits to CodeLit

Proxy

The word proxy means the ability or authority to do something on behalf of someone. So, a server acting as an entry point on behalf of the original server is called a proxy server.

Reverse Proxy Vs Load Balancer

We have discussed what reverse proxies and load balancers are, and I mentioned that a reverse proxy can also be used as a load balancer. So, do we need both, or just one? We may need both if the security of our internal systems is a priority.

You can place the load balancer in a public subnet facing the Internet and let it communicate with a reverse proxy that sits in the internal network. That reverse proxy can then distribute the load to the servers based on its configuration. The reason is that a reverse proxy can provide very granular control over how traffic is distributed: reverse proxies support different traffic-distribution algorithms, not just Round Robin, and they can handle a huge amount of incoming traffic. SSL termination also helps reverse proxies work even better, because after decrypting a request they can forward it to the appropriate server based on its content. So it is a good idea to use a reverse proxy along with a load balancer.


Credits to Nana

Database Indexing

Databases can store a lot of data, but with large amounts of data, reads can become slower, which decreases the responsiveness of the website. That's where database indexing comes in. Think of it like the index page of a book: just as a book's index helps you find the content you want, a database index points you directly to the required row instead of scanning the whole table. The trade-off is that writes to the database become slower, since the index must be updated as well.
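The scan-versus-index difference can be sketched with plain Python structures. A dict keyed by the indexed column plays the role of the index; the row and column names are made up for the example:

```python
# Without an index, finding a row means scanning every row (O(n)).
rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(10_000)]

def find_by_email_scan(email):
    for row in rows:
        if row["email"] == email:
            return row
    return None

# An index is an extra lookup structure (here, a dict keyed by email)
# that jumps straight to the row, at the cost of maintaining it on writes.
email_index = {row["email"]: row for row in rows}

print(email_index["user9999@example.com"]["id"])  # 9999, without scanning
```

Every insert or update now has to touch `email_index` too, which is the slower-writes trade-off mentioned above.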

Sharding

Sometimes, due to huge data, database reads and writes can be slow. To avoid that, we use sharding. In sharding, we divide a huge dataset into smaller parts. Say, for example, we have data for 100 users in a database. We may split that across two smaller databases, with users 1 to 50 in the first database and the remaining users 51 to 100 in the second. That's how huge databases can be managed.
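The example above is range-based sharding, which can be sketched directly (two dicts stand in for the two database servers; the function names are invented):

```python
# Range-based sharding: user IDs 1-50 live in shard 0, 51-100 in shard 1.
shards = [dict(), dict()]

def shard_for(user_id):
    return 0 if user_id <= 50 else 1

def put(user_id, record):
    shards[shard_for(user_id)][user_id] = record

def get(user_id):
    return shards[shard_for(user_id)].get(user_id)

for uid in range(1, 101):
    put(uid, {"name": f"user-{uid}"})

print(len(shards[0]), len(shards[1]))  # 50 50
print(get(75))                         # {'name': 'user-75'}
```

Real systems often shard by a hash of the key instead of a range, so the data spreads evenly even when IDs are not uniform.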

Replication

Sometimes, depending on the data size and the read load, indexing alone may not be enough. Then we can use replication. In this method, we keep multiple copies of the database. One copy, the primary, is responsible for write operations; whenever a write happens on the primary, all of the other copies are synced to the latest state. These replicas help with read operations. So if the application is read-heavy, meaning users mostly retrieve data, replication is a good option to speed up reads when dealing with large data.
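Here is a small sketch of that write-to-primary, read-from-replica flow. It is a toy: the class name is invented, and the "sync" is instantaneous, whereas real replication is usually asynchronous and replicas can briefly lag behind:

```python
class ReplicatedStore:
    """All writes go to the primary; replicas are synced and serve reads."""
    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [dict() for _ in range(replica_count)]
        self._next = 0
    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:    # sync every replica (instant here)
            replica[key] = value
    def read(self, key):
        # Spread reads across replicas round-robin to share the load.
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica.get(key)

store = ReplicatedStore()
store.write("balance:42", 100)
print(store.read("balance:42"))  # 100 (served by a replica, not the primary)
```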

Vertical Partitioning

Suppose we have a table with a huge number of columns. Reading, writing, or updating data in such a wide table may not be efficient. That's where we vertically partition the table's columns into different tables (or databases), so each one holds a different subset of the columns. A read or write request can then go directly to the partition that contains the relevant column instead of processing the whole wide table. While this improves efficiency, it can make update operations that span partitions more difficult.
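A minimal sketch of the idea, with two dicts standing in for the two column partitions (the column split, names, and sample values are all invented for illustration):

```python
# One wide "users" table split vertically: frequently read profile columns
# in one store, rarely read audit columns in another. Rows share the same id.
profile_cols = {}   # id -> {name, email}
audit_cols   = {}   # id -> {created_at, last_login_ip}

def insert_user(uid, name, email, created_at, last_login_ip):
    profile_cols[uid] = {"name": name, "email": email}
    audit_cols[uid] = {"created_at": created_at, "last_login_ip": last_login_ip}

insert_user(1, "Aang", "a@example.com", "2025-07-28", "203.0.113.9")

# A profile read touches only the narrow profile partition.
print(profile_cols[1]["name"])  # Aang

# An update spanning both partitions must touch both stores, which is
# why cross-partition updates get harder with this method.
profile_cols[1]["email"] = "new@example.com"
audit_cols[1]["last_login_ip"] = "198.51.100.7"
```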

Caching

In web applications, files like CSS, JavaScript, and HTML are requested by clients over and over. If the server had to serve these files on every request, it would consume a lot of its CPU. To avoid that, a caching mechanism is implemented. A cache stores non-sensitive but frequently requested files like HTML, CSS, and JS so it can serve them to users directly. Whenever a user requests a static file (say, an HTML page), the request first goes to the cache server, which checks whether the requested resource is available. If it is, the file is served directly to the user. If not, the request is forwarded to the origin server, which sends back the appropriate response, and the cache stores this new file too. Files in the cache typically have a TTL (Time To Live) value, after which they expire.
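The hit/miss/TTL flow described above can be sketched in a few lines. This is a toy in-process cache, not a real CDN or cache server; the class name and fetch callback are assumptions for the example:

```python
import time

class TTLCache:
    """Serve a file from cache if present and fresh; otherwise fetch it
    from the origin and store the result with an expiry time."""
    def __init__(self, fetch_from_origin, ttl_seconds):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._store = {}   # path -> (content, expires_at)
    def get(self, path):
        entry = self._store.get(path)
        if entry and entry[1] > time.monotonic():
            return entry[0]                       # cache hit
        content = self._fetch(path)               # cache miss: go to origin
        self._store[path] = (content, time.monotonic() + self._ttl)
        return content

origin_hits = []
cache = TTLCache(lambda p: origin_hits.append(p) or f"<html>{p}</html>",
                 ttl_seconds=60)
cache.get("/index.html")    # miss: hits the origin
cache.get("/index.html")    # hit: served from cache
print(len(origin_hits))     # 1 -> the origin was contacted only once
```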


Credits to codeLit

Denormalization

Web applications can keep related data in different tables. Say, for example, there is a users table and a funds table. Whenever we query across them, a JOIN is used to combine the tables, which can significantly affect efficiency. If you care mainly about speed rather than storage cost, you can store a combined copy of both tables as a single table. That makes query processing faster, but it occupies more storage, costs more, and is harder to keep updated.
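The trade-off can be shown with dicts standing in for the two tables (table names and values are invented; a real JOIN happens inside the database, not in application code like this):

```python
# Normalized: two tables, joined by user id on every query.
users = {1: {"name": "Aang"}}
funds = {1: {"balance": 250}}

def get_profile_with_join(uid):
    return {**users[uid], **funds[uid]}        # the "JOIN" step

# Denormalized: one combined table. Reads skip the join entirely,
# but the data is duplicated and every update must keep the copy in sync.
users_with_funds = {1: {"name": "Aang", "balance": 250}}

print(get_profile_with_join(1) == users_with_funds[1])  # True
```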

CAP Theorem

In system design, the CAP theorem stands for Consistency, Availability, and Partition tolerance. It says that a distributed system can only guarantee two of the three at once. Honestly, I didn't go deep into it; I felt it was not that important for now.

But I don’t know whether it's that important! I remembered the CIA triad in security 😅 when I saw this.

Blob Storage

Typical databases store structured information like usernames, first names, last names, PINs, home addresses, card details, etc. But large files like images, videos, and even static assets like HTML, CSS, and JavaScript are not a good fit for a regular database or the application server's disk. That's where blob storage is very useful. Blob storage can hold all of these file types and serve them to users efficiently, and it lets us set granular permissions on a file. Mostly, cloud blob-storage services are used for this, like Amazon S3, Google Cloud Storage, and Azure Blob Storage.

CDNs

CDN stands for Content Delivery Network. CDNs help us deliver content to users efficiently. For example, suppose your application server is located in India and you access it from another country, half a globe away. Since the request and response have to travel a huge distance, there will be some delay, which is the latency we discussed earlier, and it makes the page feel unresponsive. To overcome this, we can use CDNs. CDN servers are placed across the globe in an intelligent way, so that they cover a large number of users. When you use an application that sits behind a CDN, you technically connect to the nearest CDN edge server, and that edge server delivers the content.

Websockets

Some live applications, like games or stock-price trackers, would otherwise need to constantly poll the server, which consumes a lot of server resources and wastes bandwidth. So we need a way to get live data from the server, and that's where WebSockets come into play. A WebSocket establishes a persistent connection between the client and the server over which communication is bidirectional, so data can be pushed from the server in real time without wasting bandwidth.

Webhooks

Webhooks are used when one application needs to send data to another application whenever some event occurs. Say, for example, you pay a subscription fee; the money goes to the payment service provider, which then sends the relevant data (amount, date, transaction fee, etc.) to the application, and the application can unlock the respective features for you. The payment service provider uses a webhook to send that data to the application whenever a payment succeeds.
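One practical detail worth knowing: providers commonly sign webhook payloads with a shared secret so the receiving application can verify the event really came from them. Here is a hedged sketch of that verification using Python's standard `hmac` module (the secret, event shape, and handler name are all made up; real providers each document their own signing scheme):

```python
import hashlib
import hmac
import json

SECRET = b"shared-webhook-secret"   # hypothetical secret for this sketch

def sign(payload: bytes) -> str:
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def handle_webhook(payload: bytes, signature: str):
    # Constant-time comparison rejects forged or tampered events.
    if not hmac.compare_digest(sign(payload), signature):
        return 401, None
    return 200, json.loads(payload)

event = json.dumps({"event": "payment.success", "amount": 499}).encode()
status, data = handle_webhook(event, sign(event))
print(status, data["event"])   # 200 payment.success
```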

Microservices

Microservices are independent parts of a web application. Each microservice is independent of the others and has its own database and logic. Microservices can be scaled up and down independently whenever needed, so you can scale up just the service you need while keeping the others as they are. And these microservices can talk to each other using message queues.

Message Queues

As discussed above, message queues help microservices communicate with each other. Direct API calls may not be efficient for this kind of inter-microservice communication, so message queues are very helpful. Message queues work asynchronously: producers place messages on the queue, and consumers pick them up whenever they are ready. Some examples of message queues are RabbitMQ, Kafka, and Amazon SQS.
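The asynchronous hand-off can be sketched with Python's standard `queue` module and a worker thread, standing in for two separate services (the message names and the `None` shutdown sentinel are conventions chosen for this toy, not part of any real broker):

```python
import queue
import threading

# The producer drops messages on the queue and moves on; the consumer
# picks them up when it is ready.
q = queue.Queue()
processed = []

def consumer():
    while True:
        msg = q.get()
        if msg is None:          # sentinel: stop consuming
            break
        processed.append(f"handled {msg}")
        q.task_done()

worker = threading.Thread(target=consumer)
worker.start()

for i in range(3):
    q.put(f"order-{i}")          # producer does not wait for processing
q.put(None)
worker.join()
print(processed)
```

The producer never blocks on the consumer, which is the decoupling that makes queues attractive between microservices.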

API gateway

An API gateway is where all of the microservices can be managed. It is the single entry point for requests that consume the services' resources. An API gateway can handle authentication, authorization, and even rate limiting.

Rate limit

Rate limiting is a technique where a server or gateway limits the number of requests a user can make to the server or microservices. This is often handled by the API gateway itself, so you may not need to implement custom rate limits. Rate limiting is important to make sure attackers cannot exhaust server resources by making a large number of requests.
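One common way to implement this is a sliding-window limiter: keep each client's recent request timestamps and reject a request once the window is full. This is a simplified sketch (class and parameter names are invented; production gateways use more robust variants like token buckets backed by shared storage):

```python
import time
from collections import deque

class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""
    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.hits = {}    # client_id -> deque of request timestamps
    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        timestamps = self.hits.setdefault(client_id, deque())
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()          # drop requests outside the window
        if len(timestamps) >= self.limit:
            return False                  # would answer 429 Too Many Requests
        timestamps.append(now)
        return True

rl = SlidingWindowRateLimiter(limit=3, window=1.0)
results = [rl.allow("client-a", now=0.0) for _ in range(4)]
print(results)   # [True, True, True, False]
```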

Idempotency

Idempotency is a technique used to ensure duplicate requests are handled only once. For example, when you make a payment, a request is sent to the server; but if you refresh the page, a new, duplicate request is made. Idempotency helps the server identify whether a request has already been served. This is done by assigning each request a unique ID (an idempotency key), so even if a request is sent twice, the server recognizes it by examining the ID.
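A sketch of the key-based dedup logic, with a dict playing the server-side store (key format, function name, and amounts are invented; real systems persist the keys and expire them after a while):

```python
# The server stores each request's result under its idempotency key, so a
# retried request replays the stored response instead of charging twice.
processed = {}   # idempotency_key -> stored response
charges = []

def charge(idempotency_key, amount):
    if idempotency_key in processed:
        return processed[idempotency_key]      # duplicate: replay old response
    charges.append(amount)                     # the real side effect
    response = {"status": "charged", "amount": amount}
    processed[idempotency_key] = response
    return response

charge("key-123", 499)
charge("key-123", 499)    # a refresh resends the same key
print(len(charges))       # 1 -> the user was charged only once
```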

Application servers

Application servers sit behind the web servers and are responsible for processing business logic. Whenever a user requests a resource that requires dynamic processing, the web server forwards that request to the application server.

Web servers

Web servers sit in front of the application servers. A web server mainly serves static resources like HTML, JS, and CSS, so modern web applications use both a web server and an application server. Web servers can also act as load balancers. Refer to the section on how NGINX works below; most web servers support similar features.

Different Types of Application Servers

  1. Apache Tomcat
  2. NGINX
  3. Node JS
  4. Microsoft IIS
  5. LiteSpeed
  6. Caddy

Different Types of Web Servers

  1. NGINX
  2. Apache
  3. LiteSpeed
  4. Openresty
  5. Cloudflare
  6. Microsoft IIS
  7. lighttpd
  8. Apache Tomcat (Java applications)
  9. Caddy

So, nowadays, modern servers act as both a web server and an application server; several of the servers listed above can do both.

Working of NGINX

NGINX is arguably one of the most widely used servers. It is used as a web server, a proxy server, and even as a load balancer. NGINX is famous for capabilities such as:

  1. It can handle a huge number of requests at a time.
  2. It can cache large amounts of static content and serve it to clients, reducing the load on the backend servers
  3. Easy to configure
  4. Lightweight and easy to set up when compared to Apache
  5. Acts as a load balancer.
  6. Supports Caching capabilities.
  7. Supports SSL/TLS encryption and decryption
  8. Most importantly, it supports granular Access controls when acting as a reverse proxy
  9. It supports different algorithms for routing traffic to internal servers.
  10. It is widely used in the container world: in Kubernetes it serves as an Ingress Controller, acting as a load balancer that intelligently forwards traffic to the appropriate microservices.

These are some of the most significant capabilities of NGINX. Most web applications use NGINX somewhere in their architecture. So, whenever we are about to do some testing on web applications, there is a high chance that we will see that the application server is behind an NGINX reverse proxy. So I kinda felt like covering this important topic.

Some Final Chit-Chats

Ahh, that's a wrap for this week! Ideally, we should have covered HTTP headers this week, but I thought we needed to learn the basic system design concepts first and then move on to HTTP headers, so they will make more sense when we get there. Now we have covered some important components of system design. Next week, we will try to cover any remaining important concepts related to this; if not, we can start on HTTP headers. So, see you next weekend. Have a good day!

Got feedback or corrections? Hit me up on X if you have something interesting to discuss!

