Q1). What is data integrity?
Data integrity refers to the accuracy and consistency of data over its lifecycle.
For example: in an online banking system, data integrity ensures that your account balance remains accurate after every transaction. If you deposit $100, the balance should increase by $100, and if there’s any discrepancy, it indicates an issue with data integrity.
Q2). What is a schema?
A schema is a blueprint that defines the structure of a database, including tables, fields, and relationships.
For example: in a library database schema, you might have tables for books, authors, and borrowers. The schema defines how these tables are related, such as linking books to authors and borrowers to books they have checked out.
Q3). What is a database table?
A database table is a collection of related data organized in rows and columns.
For example: an 'Employees' table might have columns for employee ID, name, and department, with each row representing a different employee. This allows you to easily store and retrieve employee information.
Q4). What is a foreign key?
A foreign key is a field in one table that uniquely identifies a row of another table.
For example: in an 'Orders' table, a 'customer_id' field might act as a foreign key that links to the 'Customers' table. This helps in identifying which customer placed a particular order.
Q5). What is a primary key?
A primary key is a unique identifier for a record in a table.
For example: in a student database, each student might have a unique student ID as the primary key. This ensures that each student's record can be uniquely identified and accessed without confusion.
Q6). What is SQL injection?
SQL injection is a security vulnerability that allows an attacker to manipulate a database query.
For example: if a login form does not properly sanitize user input, an attacker might input malicious SQL code like 'OR 1=1' to bypass authentication and gain unauthorized access to the database.
Q7). What is a database index?
A database index is a data structure that improves the speed of data retrieval operations.
For example: if you frequently search for products by their name, creating an index on the product name column will make these searches faster by quickly locating the desired data without scanning the entire table.
Q8). What is a view in a database?
A view is a virtual table based on the result of a query.
For example: you might create a view to display only active employees from a larger 'Employees' table. This simplifies queries and enhances security by showing only the relevant data.
Q9). What is a stored procedure?
A stored procedure is a precompiled collection of SQL statements stored in the database.
For example: you might use a stored procedure to automate the process of generating a monthly sales report, which can be executed with a single command rather than writing complex queries each time.
Q10). What is a trigger in a database?
A trigger is a set of instructions that are automatically executed in response to certain events on a table.
For example: you might set up a trigger to automatically update inventory levels in the 'Products' table whenever a new order is placed, ensuring stock levels are always accurate.
Q11). What is a transaction?
A transaction is a sequence of operations performed as a single unit of work.
For example: transferring money from one bank account to another involves multiple operations (debiting one account and crediting another) that must be completed together or not at all to ensure data consistency.
Q12). What is denormalization?
Denormalization is the process of adding redundancy to a database to improve performance.
For example: in a high-traffic e-commerce site, you might denormalize data by storing frequently accessed information, like product prices, directly in the 'Orders' table to speed up queries and reduce the need for complex joins.
Q13). What is data warehousing?
Data warehousing involves collecting and managing data from various sources to provide meaningful business insights.
For example: a retail company might use a data warehouse to consolidate sales data from multiple stores, allowing for comprehensive analysis and reporting on overall sales performance.
Q14). What is data mining?
Data mining is the process of discovering patterns and knowledge from large amounts of data.
For example: a company might use data mining techniques to analyze customer purchase history and identify trends, such as the most popular products or buying patterns, to tailor marketing strategies.
Q15). What is ETL (Extract, Transform, Load)?
ETL is a process used to collect data from various sources, transform it into a suitable format, and load it into a database or data warehouse.
For example: ETL might be used to gather sales data from different regional offices, clean and aggregate it, and load it into a central data warehouse for unified reporting.
Q16). What is a database schema?
A database schema defines the structure of a database, including tables, columns, and the relationships between tables.
For example: in an e-commerce database schema, you might have tables for customers, orders, and products, defining how they relate to each other (e.g., orders link to customers and products).
Q17). What is database replication?
Database replication involves copying data from one database to another to ensure consistency and reliability.
For example: a company might replicate its customer database across multiple servers to ensure that if one server fails, the data is still available on another server.
Q18). What is database sharding?
Database sharding involves dividing a large database into smaller, more manageable pieces called shards.
For example: a social media platform might shard its user database by region to distribute the load and improve performance, ensuring that users from different regions are served by different database instances.
Q19). What is a materialized view?
A materialized view is a database object that stores the result of a query physically, providing faster access to the data.
For example: a reporting application might use a materialized view to store precomputed sales data aggregated by month, allowing for quicker report generation.
Q20). What is a data mart?
A data mart is a subset of a data warehouse that focuses on a specific business area.
For example: a marketing data mart might contain detailed data related to marketing campaigns, customer interactions, and lead conversions, enabling targeted analysis for the marketing team.
Q21). What is database partitioning?
Database partitioning divides a large database into smaller, more manageable pieces.
For example: a company might partition its transaction database by date, so that queries and maintenance tasks are faster and more efficient, especially when dealing with large amounts of historical data.
Q22). What is a distributed database?
A distributed database is spread across multiple physical locations, allowing for data to be managed and accessed from different places.
For example: a global company might use a distributed database to ensure that employees across various regions can access data efficiently without having to rely on a single centralized server.
Q23). What is eventual consistency?
Eventual consistency means that data will eventually become consistent across all nodes, but there may be temporary discrepancies.
For example: in a distributed shopping cart system, if an item is added to a cart, it might take a short time for the change to be visible to all users due to synchronization delays.
Q24). What is CAP theorem?
The CAP theorem states that a distributed database can only guarantee two out of three properties: Consistency, Availability, and Partition Tolerance.
For example: in a distributed system, you might need to choose between ensuring all nodes have the same data (consistency) and ensuring the system remains available during network partitions (partition tolerance).
Q25). What is an OLAP cube?
An OLAP cube is a multi-dimensional database structure used for complex queries and data analysis.
For example: in a sales analysis system, an OLAP cube might allow you to analyze sales data by dimensions such as time, location, and product category, enabling detailed and flexible reporting.
Q26). What is an OLTP system?
An OLTP (Online Transaction Processing) system is designed for managing and processing real-time transactional data.
For example: an e-commerce platform’s order processing system is an OLTP system that handles transactions, updates inventory, and processes payments in real time.
Q27). What is data consistency?
Data consistency ensures that data is accurate and reliable across all instances.
For example: if you update your address in an online shopping website, data consistency ensures that your new address is reflected correctly in all records, including order history and shipping details.
Q28). What is a transaction log?
A transaction log records all changes made to a database, which helps in recovering data after a failure.
For example: if a database server crashes, the transaction log can be used to replay the changes and restore the database to its most recent state before the crash.
Q29). What is a NoSQL database?
A NoSQL database is a non-relational database that provides flexible data models for handling large volumes of unstructured or semi-structured data.
For example: MongoDB is a NoSQL database that stores data in JSON-like documents, making it suitable for applications with varied and evolving data structures.
Q30). What is database optimization?
Database optimization involves tuning and adjusting a database to improve its performance and efficiency.
For example: adding appropriate indexes and optimizing queries can speed up data retrieval times and reduce the load on the database server.
Q31). What is a database cache?
A database cache stores frequently accessed data in memory to reduce the time needed to retrieve it.
For example: a website might use a cache to store user session data, allowing faster access to user information and improving the overall performance of the site.
Q32). What is a key-value store?
A key-value store is a type of NoSQL database where data is stored as a collection of key-value pairs.
For example: Redis is a key-value store where you might store user preferences as key-value pairs, with the key being the user ID and the value being their settings.
Q33). What is an ACID property?
ACID stands for Atomicity, Consistency, Isolation, and Durability, which are properties that ensure reliable transactions in a database.
For example: if a bank transaction involves transferring money between accounts, ACID properties ensure that the transaction is completed successfully, maintains data integrity, and can recover from failures.
Q34). What is an LSM tree?
An LSM (Log-Structured Merge) tree is a data structure used to handle high write volumes efficiently.
For example: in a logging system, an LSM tree helps manage and optimize the storage of log data by batching writes and periodically merging them.
Q35). What is a distributed ledger?
A distributed ledger is a decentralized database that is replicated across multiple nodes.
For example: blockchain technology is a type of distributed ledger used in cryptocurrencies to maintain a secure and transparent record of transactions.
Q36). What is database normalization?
Database normalization is the process of organizing data to reduce redundancy and improve data integrity.
For example: in a university database, normalization might involve separating student information into distinct tables for personal details, courses, and grades to avoid duplication.
Q37). What is a data lake?
A data lake is a centralized repository that stores raw data in its native format.
For example: a data lake might store unstructured data from various sources, such as social media posts, IoT sensor data, and log files, allowing for flexible and comprehensive analysis.
Q38). What is a materialized view?
A materialized view is a database object that stores the result of a query physically, providing faster access to the data.
For example: a reporting system might use a materialized view to store aggregated sales data for quick retrieval during reporting.
Q39). What is database federation?
Database federation involves integrating multiple databases into a single system so users can query them as if they were one.
For example: a company might federate its HR, sales, and finance databases to provide a unified view of employee performance metrics across departments.
Q40). What is a bloom filter?
A bloom filter is a probabilistic data structure used to test whether an element is a member of a set.
For example: a bloom filter in a search engine might quickly check if a search term is likely to be in a large dataset, reducing the need for more costly lookups.
Q41). What is data governance?
Data governance involves managing the availability, usability, integrity, and security of data within an organization.
For example: a company might implement data governance policies to ensure that customer data is accurate, protected, and used consistently across departments.