Tags:concept Status:🟨


Types of data

Summary

Structured Data

Definition: Highly organized data, typically stored in tables with rows and columns. It follows a strict schema.

Characteristics:

  • Data is stored in a predefined format (schemas).
  • Relationships between records are explicit (e.g., primary and foreign keys).
  • Easily searchable using Structured Query Language (SQL).

Examples:

  • Employee records (e.g., names, roles, salaries).
  • Sales data in an e-commerce platform.

Use Cases:

  • Banking systems to manage transactions.
  • Inventory management in retail.

Storage:

  • Stored in relational database management systems (RDBMS) like MySQL, PostgreSQL, or SQLite.
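To make the schema and key ideas above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows are invented for illustration and do not come from any real system.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a fixed schema (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE departments (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE employees (
            id            INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            role          TEXT NOT NULL,
            salary        REAL NOT NULL,
            department_id INTEGER NOT NULL REFERENCES departments(id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO departments (id, name) VALUES (?, ?)",
        [(1, "Engineering"), (2, "Sales")],
    )
    conn.executemany(
        "INSERT INTO employees (name, role, salary, department_id) "
        "VALUES (?, ?, ?, ?)",
        [
            ("Ada", "Engineer", 120000.0, 1),
            ("Grace", "Engineer", 125000.0, 1),
            ("Linus", "Account Manager", 90000.0, 2),
        ],
    )
    conn.commit()
    return conn


def employees_in(conn: sqlite3.Connection, department: str) -> list:
    """Follow the foreign key from employees to departments with a JOIN."""
    return conn.execute(
        "SELECT e.name, e.role FROM employees e "
        "JOIN departments d ON d.id = e.department_id "
        "WHERE d.name = ? ORDER BY e.name",
        (department,),
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(employees_in(db, "Engineering"))
```

Because the schema is strict, a row that points at a department that does not exist is rejected by the database rather than stored.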

Semi-Structured Data

Definition: Data with a flexible structure that uses tags or markers (as in XML or JSON) to organize information, but without a strict schema.

Characteristics:

  • Partially organized but not as rigid as relational databases.
  • Can be nested or hierarchical, with objects inside objects (e.g., nested JSON objects).
  • Usually queried with format-specific tools (e.g., XPath for XML, JSONPath for JSON) rather than plain SQL.
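The characteristics above can be sketched with Python's standard library alone. The comment thread below is invented sample data. The JSON part walks the nesting by hand instead of using a JSONPath library, and the XML part relies on the limited XPath subset that xml.etree.ElementTree supports.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical API response: records share no fixed schema.
# The first comment has "replies", the second has "likes" instead.
RAW_JSON = """
[
  {"user": "ana", "text": "Nice post!",
   "replies": [{"user": "ben", "text": "Agreed."}]},
  {"user": "cy", "text": "Thanks", "likes": 3}
]
"""


def count_comments(items: list) -> int:
    """Count comments at every nesting level (objects inside objects)."""
    total = 0
    for item in items:
        # .get() tolerates records that simply lack the "replies" key.
        total += 1 + count_comments(item.get("replies", []))
    return total


# The same thread as XML, again invented for illustration.
RAW_XML = (
    "<thread>"
    "<comment user='ana'>Nice post!"
    "<comment user='ben'>Agreed.</comment>"
    "</comment>"
    "<comment user='cy'>Thanks</comment>"
    "</thread>"
)


def comments_by(root: ET.Element, user: str) -> list:
    """Find comments by one user at any depth using an XPath-style query."""
    return [node.text for node in root.findall(f".//comment[@user='{user}']")]


if __name__ == "__main__":
    comments = json.loads(RAW_JSON)
    print(count_comments(comments))
    print([c.get("likes", 0) for c in comments])
    print(comments_by(ET.fromstring(RAW_XML), "ben"))
```

Optional fields are handled with defaults rather than a schema change, which is the flexibility the definition describes.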

Examples:

  • XML or JSON files for API responses.
  • Documents stored in NoSQL databases like MongoDB.

Use Cases:

  • Storing configuration files for software.
  • Managing social media comments or chat logs.

Storage:

  • Commonly stored in NoSQL databases or in files such as XML and JSON documents.

Unstructured Data

Definition: Data without any predefined schema or structure, making it more complex to process and analyze.

Characteristics:

  • No uniform format.
  • Often includes large binary objects (e.g., images, videos).
  • Usually needs more advanced techniques to analyze (e.g., machine learning models for text or images).

Examples:

  • Multimedia files (e.g., photos, music, videos).
  • Raw text files or documents.

Use Cases:

  • Analyzing customer reviews for sentiment.
  • Content management systems for websites.
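As a rough illustration of the review-sentiment use case, the sketch below scores raw text against two tiny word lists. The word lists and the scoring rule are assumptions made up for this example. Real sentiment analysis would normally use a trained model, so treat this as a toy that only shows why free text needs processing before it can be analyzed.

```python
import re

# Hypothetical mini-lexicons; a real system would use a trained model.
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"bad", "slow", "broken", "terrible"}


def sentiment_score(review: str) -> int:
    """Return (# positive words) - (# negative words) for a raw text review."""
    # Raw text has no fields to query, so we first impose structure ourselves
    # by lowercasing it and splitting it into word tokens.
    words = re.findall(r"[a-z']+", review.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


if __name__ == "__main__":
    for text in (
        "Great phone, love the fast charger!",
        "Terrible battery and slow, broken screen.",
        "It arrived on Tuesday.",
    ):
        print(sentiment_score(text), text)
```

A score above zero suggests a positive review and below zero a negative one. Sarcasm, negation ("not great"), and misspellings all defeat this approach, which is part of why unstructured data is harder to analyze.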

Storage: