Customise Consent Preferences

We use cookies to help you navigate efficiently and perform certain functions. You will find detailed information about all cookies under each consent category below.

The cookies that are categorised as "Necessary" are stored on your browser as they are essential for enabling the basic functionalities of the site. ... 

Always Active

Necessary cookies are required to enable the basic features of this site, such as providing secure log-in or adjusting your consent preferences. These cookies do not store any personally identifiable data.

No cookies to display.

Functional cookies help perform certain functionalities like sharing the content of the website on social media platforms, collecting feedback, and other third-party features.

No cookies to display.

Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics such as the number of visitors, bounce rate, traffic source, etc.

No cookies to display.

Performance cookies are used to understand and analyse the key performance indexes of the website which helps in delivering a better user experience for the visitors.

No cookies to display.

Advertisement cookies are used to provide visitors with customised advertisements based on the pages you visited previously and to analyse the effectiveness of the ad campaigns.

No cookies to display.

PostgreSQL DBA Interview Questions

Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
  • User AvatarAshiwini
  • 30 Jan, 2025
  • 0 Comments
  • 38 Mins Read

PostgreSQL DBA Interview Questions

  1. What is PostgreSQL?

    PostgreSQL is a lightweight, free, and open-source relational database management system. PostgreSQL is used widely across regions and companies and can be used in most popular operating systems.

  2. What are the benefits of PostgreSQL?

    PostgreSQL excels among other SQL databases for several reasons, including:
    1.Robustness that makes it suitable for all kinds of applications
    2.Free and open-source
    3.Security and reliability
    4.Wide variety of data types
    5.A big community of users worldwide.

  3. How connection is established in postgres?

    The PostgreSQL server (formerly known as Postmaster) acts as the main supervisor process, listening on a designated TCP/IP port for incoming client connections. Upon receiving a connection request, it spawns a separate backend process to handle that session. These backend processes communicate with each other and other instance components using shared memory and semaphores, ensuring data consistency during concurrent access. Once connected, the client sends queries to its assigned backend process, which then parses, plans, and executes the query. The results are transmitted back to the client over the established connection.

  4. Explain the postgres architecture ?

    PostgreSQL follows a multi-process architecture, where each client connection is handled by a separate backend process. The PostgreSQL server process (previously known as Postmaster) listens on a specified TCP/IP port for incoming client requests. When a connection request arrives, it spawns a dedicated backend process to handle that session. These backend processes interact with shared memory and use inter-process communication (IPC) mechanisms like semaphores and spinlocks to ensure data consistency. The architecture consists of key components such as shared memory (used for caching and transaction management), background processes (like WAL Writer, Checkpointer, and Autovacuum Daemon), and storage structures (such as tables, indexes, and transaction logs). PostgreSQL employs MVCC (Multi-Version Concurrency Control) to allow multiple transactions to run simultaneously without blocking each other. The WAL (Write-Ahead Logging) mechanism ensures data durability by logging changes before applying them to data files. This modular and process-based architecture enhances PostgreSQL’s scalability, reliability, and performance, making it suitable for both transactional and analytical workloads.

  5. What are some important background processes in postgres?

    PostgreSQL has several important background processes that ensure smooth database operation:

    1. Checkpointer – Flushes dirty pages from shared buffers to disk at regular intervals to reduce write overhead.
    2. WAL Writer – Writes transaction logs (WAL – Write-Ahead Logging) to ensure data durability and crash recovery.
    3. Autovacuum Daemon – Cleans up dead tuples to prevent table bloat and optimize performance.
    4. Background Writer – Helps reduce I/O spikes by pre-writing dirty pages to disk before checkpoints occur.
    5. Archiver – Handles WAL file archiving when continuous archiving is enabled.
    6. Stats Collector – Gathers database statistics for query optimization and performance monitoring.
    7. Logical Replication Launcher – Manages logical replication workers for data synchronization between databases.
    8. WAL Receiver/Sender – Handles streaming replication between primary and standby servers.

    These background processes enhance performance, data integrity, and high availability in PostgreSQL.

  6. What are the memory components in postgres?

    PostgreSQL has several important memory components that optimize performance and manage database operations efficiently:

    1. Shared Buffers – Caches frequently accessed data pages to reduce disk I/O.
    2. Work Memory – Allocated per query for sorting and hashing operations.
    3. Maintenance Work Memory – Used for maintenance tasks like vacuuming and indexing.
    4. WAL Buffers – Temporary storage for Write-Ahead Logging (WAL) before writing to disk.
    5. Effective Cache Size – An estimate of how much OS cache PostgreSQL can use for query planning.
    6. Temp Buffers – Session-local buffers used for temporary tables.
    7. Kernel Page Cache – OS-level caching that stores frequently accessed data to improve performance.

    These memory components help PostgreSQL handle queries efficiently, minimize disk access, and optimize overall performance.

  7. what is the maximum file size of table or index in postgres? Can we increase that ?

    Max size is 1GB. If a table size is big, then it can spread across multiple files.

  8. When wal writer write data to wal segement?

    The WAL Writer process writes data to WAL (Write-Ahead Logging) segments periodically to ensure durability and crash recovery. It writes data when:

    1. WAL Buffers are Full – If the WAL buffer fills up, WAL Writer flushes data to WAL segments.
    2. Timeout Occurs – It writes at regular intervals (wal_writer_delay, default 200ms) to prevent data loss.
    3. Transaction Commit – WAL entries are flushed when a transaction commits to ensure durability.
    4. Checkpoints Trigger – During a checkpoint, WAL data is written and synced to disk for consistency.
  9.  When bgwriter writes data to disk?

    The Background Writer (bgwriter) writes dirty pages from shared buffers to disk proactively to reduce I/O spikes and improve performance. It operates independently of checkpoints and writes data at regular intervals based on bgwriter_lru_maxpages and bgwriter_lru_multiplier settings. The primary triggers for writing include buffer eviction (when new pages are needed), time-based intervals (bgwriter_delay), and system workload. By gradually writing pages, it prevents sudden bursts of disk I/O during checkpoints, enhancing PostgreSQL’s overall efficiency and responsiveness.

  10. What is restore_command?

    restore_command in PostgreSQL is a configuration setting that specifies the command used to retrieve archived WAL (Write-Ahead Log) files during recovery or replication.

  11. What is the significance of pg_ident.conf file?

    Just like we have os authenticated db users in oracle. Here in postgres also we have similarconcept. We can provide mapping of os user and postgres db user inside pg_ident.conf file.

  12. What is recovery_target_time?

    recovery_target_time is a configuration parameter in PostgreSQL that specifies the exact point in time to which the database should be recovered during point-in-time recovery.

  13. What is pg_dumpall?

    pg_dumpall is a PostgreSQL utility that backs up the entire database cluster—including all databases, roles, and global objects—into a single dump file.

  14. What are some wal related parameter in postgrs.conf file?

    max_wal_size → It defines the soft limit for the total WAL segment size. If this limit is reached, a checkpoint is triggered to free up space.
    wal_keep_segments → Specifies the number of old WAL segments to retain for standby servers. (Deprecated in newer versions, replaced by wal_keep_size.)
    wal_keep_size → Defines the size (in MB or GB) of WAL files to retain for replication before they are recycled.
    max_wal_senders = 10 → Sets the maximum number of WAL sender processes that handle streaming replication to standby servers.

  15. What is a tablespace?

    In PostgreSQL, a tablespace is a storage location on disk used to store database objects like tables and indexes. It helps in managing disk usage and optimizing performance by placing data on specific storage devices.

    Example of Creating a Tablespace:
    CREATE TABLESPACE my_tablespace LOCATION '/data/my_tablespace';
  16. What is checkpoint? When checkpoint happens in postgres?
  17. Which parameters controls the behaviour of BGWriter?

    The Background Writer (bgwriter) behavior in PostgreSQL is controlled by the following parameters:

    1. bgwriter_delay → Defines the interval (in ms) between background writer runs (default 200ms).
    2. bgwriter_lru_maxpages → Sets the maximum number of dirty pages written to disk per bgwriter cycle.
    3. bgwriter_lru_multiplier → Determines how aggressively bgwriter writes pages based on buffer usage (default 2.0).
  18. Does postgres support direct i/o.?

    No, PostgreSQL does not support Direct I/O natively. It relies on the operating system’s page cache for disk I/O operations, which improves performance by reducing direct disk access.

  19. What is the difference between TRUNCATE and DELETE?

    TRUNCATE is faster, removes all rows at once
    DELETE removes rows one by one, can use WHERE clause

  20. What is continuous archiving?

    Method of archiving WAL files continuously
    Essential for PITR capability

  21. What is autovacuum in PostgreSQL?

    Automated process that removes dead tuples and updates statistics
    Essential for maintaining database performance

  22. How do you create and drop a database in PostgreSQL?

    To create a database in PostgreSQL, use the following command:

    CREATE DATABASE dbname;
    

    To drop a database, use:

    DROP DATABASE dbname;
    

    Ensure no active connections to the database before dropping it. You can check for active connections using pg_stat_activity.

  23. How do you take a hot backup?

    To take a hot backup in PostgreSQL using pg_basebackup while the database is running:

    1. Enable ARCHIVE_MODE:
      Set archive_mode = on and configure archive_command in postgresql.conf, then restart PostgreSQL.
    2. Run pg_basebackup:
      Use pg_basebackup to take a backup:

      pg_basebackup -h localhost -D /path/to/backup -U replicator -P --wal-method=stream
  24. How do you create a user and assign privileges in PostgreSQL?

    To create a user in PostgreSQL, use the CREATE USER command:

    CREATE USER username WITH PASSWORD 'password';
    

    To assign privileges, you can grant specific permissions like SELECT, INSERT, UPDATE, etc., using the GRANT command. For example:

    GRANT ALL PRIVILEGES ON DATABASE dbname TO username;
    

    This grants the user all privileges on the specified database. You can also grant table-specific privileges:

    GRANT SELECT, INSERT ON table_name TO username;
  25. How do you take a full database backup using **pg_dump**?

    To take a full database backup using pg_dump, use the following command:

    pg_dump -U username -F c -b -v -f /path/to/backupfile.dbname.tar dbname
  26. How do you manage users and roles in PostgreSQL?

    In PostgreSQL, manage users and roles using SQL commands: create users with CREATE ROLE username WITH LOGIN PASSWORD 'password';, assign roles with GRANT role TO user;, and set permissions with GRANT SELECT ON table TO user;. Modify roles with ALTER ROLE, remove with DROP ROLE, and manage privileges using REVOKE. Use pg_roles view to monitor roles and permissions.

  27. How authentication happens in postgres?
    1. Password-Based Authentication – Uses MD5, SCRAM-SHA-256, or plain passwords for user verification.
    2. Trust – Allows connections without authentication (used for local, trusted environments).
    3. Peer Authentication – Uses OS user credentials to match PostgreSQL roles (Unix/Linux only).
    4. Ident Authentication – Relies on an external Ident server to verify the user.
    5. Kerberos, LDAP, PAM, and SSPI – Supports enterprise authentication mechanisms for secure login.
    6. Certificate-Based Authentication – Uses SSL/TLS certificates for secure connections.
  28. What is the significance of pg_ident.conf file?

    The pg_ident.conf file in PostgreSQL is used for user mapping when using Ident or external authentication methods like LDAP, Kerberos, or PAM. It allows mapping OS-level or external usernames to PostgreSQL roles, enabling flexible authentication.

  29. What is recovery.conf?

    In PostgreSQL <12, the recovery configuration is managed using the recovery.conf file.
    In PostgreSQL ≥12, recovery is managed using recovery.signal for recovery-related settings and standby.signal to enable replication.

  30. What is pg_dumpall?

    pg_dumpall is a PostgreSQL utility that backs up the entire database cluster—including all databases, roles, and global objects—into a single dump file.

  31. What is archive_command?

    archive_command in PostgreSQL is a configuration setting that specifies the command used to archive WAL (Write-Ahead Log) files to a safe location for point-in-time recovery (PITR).

  32. How do you handle bloated tables and indexes?

    To handle bloated tables and indexes in PostgreSQL, use VACUUM to reclaim space from deleted rows. For severe bloat, run VACUUM FULL to compact the table and reclaim disk space. For bloated indexes, use REINDEX to rebuild them. Regular autovacuum helps manage bloat over time. Monitoring tools like pg_stat_user_tables and pg_stat_all_indexes can help identify and manage bloat efficiently.

  33. What are the different types of backups in PostgreSQL?
    • Logical Backup – Uses tools like pg_dump and pg_dumpall to export data in SQL or other formats for selective restoration.
    • Physical Backup – Copies database files directly using pg_basebackup or file system-level backups for full database recovery.
    • Point-In-Time Recovery (PITR) – Combines WAL archiving with a base backup to restore the database to a specific point in time.
  34. What is this wal_level parameter , different values of wal_level?

    The wal_level parameter controls the amount of information written to the Write-Ahead Log (WAL), impacting replication, backups, and point-in-time recovery (PITR).

    Different Values of wal_level
    1. minimal → Generates only enough WAL for crash recovery (no replication or PITR).
    2. replica (default) → Supports streaming replication and PITR, recording changes at the row level.
    3. logical → Includes additional details for logical replication and decoding (used for logical replication tools).
     
  35. What is visibility map in postgres?

    The Visibility Map (VM) in PostgreSQL is a lightweight data structure that tracks which pages in a table contain only frozen tuples (i.e., tuples that do not need further vacuuming). It helps autovacuum and index-only scans work efficiently by reducing the need to scan unnecessary pages.

    When a page is marked all-visible in the Visibility Map, PostgreSQL can skip scanning it during vacuum operations, improving performance. Additionally, index-only scans use the VM to determine if they can return results without accessing heap pages, making queries faster. The VM is automatically maintained by vacuum and autovacuum processes.

  36. What are statistics in PostgreSQL?

    **Statistics** in PostgreSQL are collected metadata about database objects (such as tables, indexes, and columns) that help the query planner optimize query execution by estimating the cost of different query plans. for a couple of seconds

  37. How do you list all databases and tables?

    To list all databases in PostgreSQL, use the following command:

    \l   -- or  \list
    

    To list all tables in the current database, use:

    \dt
    
  38. What is visibility map in postgres?

    Every heap relation (i.e table/index) have a visibility map associated with them.Every visibility map has 2 bits per page.The first bit, if set, indicates that the page is all-visible (means those pages need not to bevacuumed)The second bit, if set means, all tuples on this page has been frozen. (No need to vacuum) Note – > Visibility map bits are set by VACUUM operation. And if data is modified, bits will becleared.This condition helps in index only scan.

  39. What is the difference between **VACUUM**, **ANALYZE**, and **VACUUM FULL**?

    In PostgreSQL, VACUUM reclaims storage by cleaning up dead tuples without locking tables, improving performance over time. ANALYZE collects statistics on table contents to help the query planner optimize queries. VACUUM FULL, unlike regular VACUUM, reclaims space more aggressively by compacting tables and indexes, but it requires an exclusive lock on the table, making it more disruptive. Use VACUUM regularly, ANALYZE for query optimization, and VACUUM FULL for severe bloat or space reclamation.

  40. What is free space mapping (FSM) in postgres?

    Each table/index has a Free space mapping file. It keeps information about which pages are free.The VACUUM process also updates the Free Space Map and using VACUUM FULL we canrecover those free spaces.

  41. How do you perform a failover in PostgreSQL?

    To perform a failover in PostgreSQL, promote the standby to primary using pg_ctl promote or a trigger file. Update application connections to the new primary. Reconfigure the old primary as a standby if needed. Tools like Patroni, Repmgr, or Pgpool-II can automate failover for high availability.

  42. What is free space mapping(FSM) in postgres?

    The Free Space Map (FSM) in PostgreSQL is a data structure that tracks available free space in table pages. It helps PostgreSQL efficiently find pages with enough room to insert new tuples, reducing the need for table bloat and minimizing random I/O.

    FSM is automatically updated during vacuum and insert operations. When a new row is inserted, PostgreSQL checks FSM to locate a page with sufficient free space instead of always appending data to the end of the table. This improves storage efficiency and helps maintain better performance over time.

  43. How do you change a user’s password?

    To change a user’s password in PostgreSQL, use the following command:

    ALTER USER username WITH PASSWORD 'new_password';
  44. Explain Logical Replication vs Physical Replication

    Logical Replication in PostgreSQL replicates specific tables or databases by streaming changes at the SQL level, allowing selective data replication, cross-version replication, and flexible replication setups. In contrast, Physical Replication copies the entire database at the disk level using WAL files, providing exact copies of the database for high availability, but without table-level granularity. Logical replication is more flexible, while physical replication is faster and simpler for full database replication.

  45. How does partitioning work in PostgreSQL?

    Partitioning in PostgreSQL divides a large table into smaller, manageable pieces (partitions) to improve query performance and maintenance. PostgreSQL supports range, list, and hash partitioning, where data is stored in separate child tables based on predefined rules. The planner optimizes queries by scanning only relevant partitions (partition pruning), reducing I/O and execution time.

  46.  Describe the purpose of the pg_isready command in PostgreSQL.What is initial fork?
    The pg_isready command in PostgreSQL is used to check the availability and readiness of a PostgreSQL database server. It helps in monitoring database health and is commonly used in scripts and automated systems to detect if the server is accepting connections.

    pg_isready -h localhost -p 5432 -U postgres
  47. How do you enable SSL in PostgreSQL?

    To enable SSL in PostgreSQL, modify postgresql.conf by setting ssl = on and specifying certificate and key files (ssl_cert_file and ssl_key_file). Update pg_hba.conf to allow SSL connections using hostssl. Restart the PostgreSQL service for changes to take effect. Ensure the server has valid SSL certificates to establish secure connections.

  48. Explain the role of **WAL (Write-Ahead Logging)** in PostgreSQL.

    Write-Ahead Logging (WAL) in PostgreSQL ensures data durability and crash recovery by recording changes to a log before applying them to the data files. WAL allows PostgreSQL to recover from crashes by replaying logs and supports Point-In-Time Recovery (PITR) and replication. By minimizing disk writes and enabling efficient backups, WAL enhances database reliability, consistency, and performance

  49. What is TOAST in PostgreSQL?

    TOAST (The Oversized-Attribute Storage Technique) is a mechanism in PostgreSQL that stores large values efficiently by compressing and storing them out-of-line in a separate table. It is used when a row exceeds the page size limit (typically 8KB).

    When a column, like TEXT, BYTEA, or JSONB, contains a large value, PostgreSQL automatically compresses and moves it to a TOAST table, keeping only a reference in the main table. This helps reduce bloat, optimize queries, and improve performance by keeping frequently accessed data in memory while storing large values separately.

  50. Explain Row-Level Security (RLS) in PostgreSQL.

    Row-Level Security (RLS) in PostgreSQL controls access to individual rows in a table based on user roles. It is enabled with ALTER TABLE table_name ENABLE ROW LEVEL SECURITY; and policies are created using CREATE POLICY to define conditions for row access. RLS ensures that users see only the rows they are authorized to access, enhancing data security and isolation.

  51.  Explain the difference between TOAST tables and regular tables in PostgreSQL.

    In PostgreSQL, TOAST (The Oversized-Attribute Storage Technique) tables store large column values separately from the main table, while regular tables store all data within the main table heap.

  52. What is the default block size in postgres? Can we set different block size?

    The default block size in PostgreSQL is 8 KB. It is defined at compile time using the BLCKSZ parameter.You cannot change the block size for an existing database, but you can set a different block size (between 1 KB and 32 KB) by recompiling PostgreSQL from source with a custom BLCKSZ value.

  53. How does PostgreSQL handle **MVCC (Multi-Version Concurrency Control)?

    PostgreSQL uses Multi-Version Concurrency Control (MVCC) to manage concurrent transactions without blocking reads and writes. Instead of locking rows, MVCC creates multiple versions of a record, allowing transactions to access the appropriate snapshot based on their start time. Dead tuples (old versions) are later cleaned up by autovacuum.

  54. How do you detect and fix database corruption?

    To detect database corruption in PostgreSQL, check logs for errors and use tools like pg_check or pg_dump to identify issues. The pg_catalog views can also help spot inconsistencies. To fix corruption, restore from a recent backup, use pg_resetwal if needed, or rebuild corrupted indexes and tables. Regular backups and CHECKSUMS help prevent and recover from corruption efficiently.

  55. What happens in the background during vacuuming process?

    Background Process During VACUUM in PostgreSQL

    When VACUUM runs in PostgreSQL, it performs multiple background tasks to clean up dead tuples and optimize storage without blocking normal operations. It scans table pages to identify dead tuples left behind by UPDATE and DELETE operations. These tuples are then marked as reusable in the Free Space Map (FSM), allowing future inserts to reuse space efficiently.

    If AUTOVACUUM is enabled, it runs automatically, adjusting workload based on system activity. VACUUM FULL, however, rewrites the entire table, reclaiming space but requiring more resources. Additionally, ANALYZE can be triggered to update statistics, improving query planner decisions for better performance.

  56. How do you monitor and manage autovacuum in PostgreSQL?

    Use the pg_stat_progress_vacuum view and autovacuum configuration parameters to monitor and control the automatic vacuum process.

  57. What are the different **index types** in PostgreSQL?
    PostgreSQL supports several index types to optimize query performance:

    1. B-Tree Index – Default and most commonly used, ideal for range and equality queries.
    2. Hash Index – Optimized for exact matches, but less commonly used due to WAL limitations (before PostgreSQL 11).
    3. GIN (Generalized Inverted Index) – Best for full-text search and JSONB queries.
    4. GiST (Generalized Search Tree) – Used for geospatial, full-text, and complex data types.
    5. BRIN (Block Range INdex) – Efficient for large tables with sequentially correlated data.
    6. SP-GiST (Space-Partitioned GiST) – Supports hierarchical and non-balanced tree structures.

     

  58. What is the significance of search_path in postgres?

    The search_path in PostgreSQL defines the schema lookup order when executing SQL queries. It allows users to refer to objects (tables, views, functions) without specifying their schema explicitly. When a query is run, PostgreSQL searches for the object in the schemas listed in search_path, from left to right, until it finds a match.

    By default, it includes the public schema, but it can be customized to prioritize specific schemas. This helps organize database objects, avoid naming conflicts, and enhance security by restricting access to certain schemas.

  59. Difference between pg_log, pg_clog, pg_xlog?
    • pg_log → Stores database server logs, including errors, queries, and system messages. This helps in troubleshooting and monitoring database activity.
    • pg_clog (now pg_xact in newer versions) → Stores transaction commit status information, tracking whether a transaction is committed or rolled back. Essential for transaction recovery and MVCC consistency.
    • pg_xlog (now pg_wal) → Contains Write-Ahead Logs (WAL), which ensure data durability and help in crash recovery. It records all changes before they are written to disk.
     
  60.  What is ctid?

    ctid is a system column in PostgreSQL that uniquely identifies the physical location of a row within a table. It stores the block number and tuple index (offset) within the block, helping PostgreSQL quickly locate rows.

    Since ctid changes when a row is updated or moved (e.g., by VACUUM FULL), it is not a reliable primary key but is useful for efficient row lookups, indexing, and low-level operations like DELETE optimizations.

  61. How can you monitor and manage replication lag in PostgreSQL streaming replication?

    Calculate replication lag by comparing the current WAL location on the primary with the corresponding location on the standby

  62. What is oid?

    Every row in postgres will have a object identifier called oid.

  63. difference between oid and relfilenode?
    • oid (Object Identifier) → A unique identifier assigned to database objects like tables, indexes, and functions. It is stored in system catalogs (pg_class, pg_database, etc.) and remains constant unless the object is dropped and recreated.
    • relfilenode → Represents the physical file name on disk for a table or index. PostgreSQL uses relfilenode to map a relation (pg_class.relfilenode) to its actual storage file in the base/ directory. If VACUUM FULL or REINDEX is performed, the relfilenode can change, but the oid remains the same.
  64. Difference between postgres.conf and postgres.auto.conf file?

    postgres.conf is the configuration file of the postgres cluster. But when we do any config changes using alter system command, then those parameters are added in postgres.auto.conf file.
    When postgres starts , it will first read postgres.conf and then it will read postgres.auto.conf  file.

  65. What is the use of **materialized views**?

    A materialized view in PostgreSQL stores the result of a query physically on disk, allowing faster retrieval of complex queries. Unlike regular views, it does not update automatically and requires REFRESH MATERIALIZED VIEW to update data.

  66. What is effective_cache_size?

    **effective_cache_size** is a PostgreSQL configuration parameter that estimates the amount of memory available for disk caching by the operating system and database, helping the query planner optimize queries.

  67. What is work_mem?

    **work_mem** is a PostgreSQL configuration parameter that sets the amount of memory allocated for operations like sorting and hashing before writing data to disk. for a couple of seconds

  68. What is shared_buffers?

    shared_buffers is a PostgreSQL configuration parameter that determines the amount of memory allocated for caching data blocks to reduce disk I/O.

  69. What is pgAdmin in PostgreSQL?

    pgAdmin is a handy utility that comes with the PostgreSQL installation, and it lets you do regular database-related tasks through a nice graphical interface.

  70. What is timeline in postgres?

    Timeline in postgres is used to distinguish between original cluster and recovered one. When we initialize the cluster, the timelineid will be set to 1. But if database recovery happens then it will increase to 2.

  71.  What is the default port in postgres?

    Default is 5432.

  72. What is difference between pg_cancel_backend vs pg_terminate_backend?
    1. pg_cancel_backend(pid)Gracefully cancels a running query without terminating the session. It sends a SIGINT signal to the backend process, allowing it to clean up and stop execution safely. However, the session remains connected.
    2. pg_terminate_backend(pid)Forcibly kills the backend process, terminating the entire session. It sends a SIGTERM signal, closing all active transactions abruptly, which may lead to rollback.

    Key Difference

    • Use pg_cancel_backend when you want to stop a long-running query without disconnecting the session.
    • Use pg_terminate_backend when you need to forcefully disconnect a session, such as in case of blocking transactions or resource-heavy queries.
  73. What is the use of pgbench utility?

    pgbench is a benchmarking tool used to test PostgreSQL performance by simulating workloads and measuring transaction throughput. It helps in stress testing, performance tuning, and identifying bottlenecks. Users can run default or custom SQL workloads to analyze system efficiency under different loads

  74. How can you configure and manage replication slots in PostgreSQL?

    Use the pg_create_logical_replication_slot and pg_drop_replication_slot functions to manage replication slots.

  75. How can we encrypt specific columns in postgres?

    PostgreSQL does not provide built-in column-level encryption, but you can encrypt specific columns using PGCrypto or client-side encryption.

  76. Which utility is used to upgrade postgres cluster?

    pgbouncer is used for connection pooling

  77. What are the popular tools for managing backup and recovery in postgres?

    edb bart , barman etc

  78. What are the different types of replication in PostgreSQL?

    PostgreSQL supports Streaming Replication (WAL-based real-time standby), Logical Replication (table-level replication via PUBLICATION/SUBSCRIPTION), Physical Replication (disk-level data copy), Synchronous Replication (zero data loss), and Asynchronous Replication (better performance with minimal data loss risk). Tools like pglogical, Slony-I, and Bucardo offer additional replication options.

  79. Describe the purpose of the pg_rewind command in PostgreSQL.

    The pg_rewind command in PostgreSQL is used to resynchronize a standby or failed primary server with a new primary after a failover. Instead of performing a full base backup, pg_rewind efficiently rewinds the diverged instance by copying only the changed data, making it faster and reducing downtime.

    pg_rewind --target-pgdata=/var/lib/postgresql/data --source-server="host=new_primary user=postgres"
    
  80. What is an extension in postgres? Which extensions you have used ?

    pg_stat_statements

    pg_track_settings

    pg_profile

  81. How do you set up Streaming Replication?

    To set up Streaming Replication in PostgreSQL, enable WAL logging on the primary server (wal_level = replica, max_wal_senders = 3), create a replication user, and allow standby connections. On the standby server, use pg_basebackup to copy data, create standby.signal (PostgreSQL 12+), and configure primary_conninfo in postgresql.conf. Restart the standby server to start replication.

  82. How to kill a session in postgres?

    In PostgreSQL, you can terminate or cancel a session using the following functions:

    Cancel a Query (without killing the session):

    SELECT pg_cancel_backend(pid);

    This stops the running query but keeps the session active.

    Terminate a Session (forcefully kill the session):

    SELECT pg_terminate_backend(pid);
  83. How you monitor long running queries in postgres?

    we can use pg_stat_activities to track .

  84. What are **CTEs (Common Table Expressions)**, and how do they work?

    Common Table Expressions (CTEs) in PostgreSQL are temporary result sets that simplify complex queries and improve readability. Defined using the WITH clause, CTEs can be referenced multiple times within a query. They are especially useful for recursive queries and breaking down complex logic.

  85. What is the role of pg_stat_replication?

    pg_stat_replication is a PostgreSQL system view that monitors replication on the primary server, showing active standby connections, replication lag, and status. Key columns include client_addr (standby IP), state (streaming, catchup), and sent_lsn, write_lsn, replay_lsn (WAL positions). Use it to check replication health:

    SELECT pid, client_addr, state, sent_lsn, replay_lsn, sync_state FROM pg_stat_replication;
  86. What is the purpose of the pg_stat_bgwriter view in PostgreSQL?

    The pg_stat_bgwriter view in PostgreSQL provides statistics about the background writer process, which helps manage buffer writes to disk. It tracks metrics like checkpoints, buffers written, and WAL sync operations, aiding in performance tuning and monitoring database I/O behavior.

    To check bgwriter stats:

    SELECT * FROM pg_stat_bgwriter;
  87. What is table partitioning in postgres? What are the advantages?

    Table Partitioning in PostgreSQL

    Table partitioning divides a large table into smaller partitions based on criteria like range, list, or hash. The parent table is a logical structure, while actual data is stored in partitions.

    Advantages:

    • Faster Queries – Scans only relevant partitions.
    • Efficient Data Management – Easy to archive or drop old partitions.
    • Better Performance – Improves bulk inserts and deletes.
    • Optimized Indexing – Smaller and more efficient indexes.
    • Reduced Contention – Enhances concurrency by distributing load.
  88. Between postgres and nosql database like mongodb , which one is better?

    The choice between PostgreSQL and MongoDB depends on your use case:

     PostgreSQL (Relational DB) – Best for structured data, strong ACID compliance, complex queries, and transactions. Suitable for financial systems, analytics, and applications requiring strict data consistency.MongoDB (NoSQL, Document DB) – Ideal for unstructured or semi-structured data, high scalability, and flexible schemas. Best for big data, real-time apps, and fast-growing applications with schema changes.

    Which is Better?

    Choose PostgreSQL for reliable transactions, reporting, and complex queries.
    Choose MongoDB for scalability, flexibility, and high-speed document-based storage.

  89. Explain the difference between **JSON and JSONB** in PostgreSQL.
    In PostgreSQL, JSON stores data as a plain text representation, preserving formatting but requiring re-parsing for each query. JSONB, on the other hand, stores data in a binary format, allowing faster searches, indexing, and efficient storage. While JSON is better for simple storage and retrieval, JSONB is preferred for complex queries and indexing.

    CREATE TABLE data (info JSON, info_b JSONB);
    
  90. What are some key difference between oracle and postgres?

    Oracle is a proprietary, enterprise-grade relational database, while PostgreSQL is an open-source, community-driven database. Oracle offers advanced scalability, partitioning, and multi-tenant architecture, making it ideal for large enterprises and mission-critical applications. It supports PL/SQL for procedural programming, advanced replication options like Active Data Guard and GoldenGate, and enterprise-grade backup and recovery tools such as RMAN.

    PostgreSQL, on the other hand, is highly extensible, allowing users to define custom data types, functions, and extensions. It supports PL/pgSQL, along with other languages like Python and Java. PostgreSQL is preferred for modern applications due to its native JSON support, flexible partitioning, and cost-effectiveness. While Oracle provides robust enterprise solutions, PostgreSQL excels in openness, flexibility, and cost efficiency, making it a strong alternative for organizations looking to avoid high licensing costs.

  91. How do you create a new database in PostgreSQL?

    To create a new database, use the SQL command:
    sqlCopy code
    CREATE DATABASE dbname;

  92. How does partitioning work in PostgreSQL?

    Partitioning in PostgreSQL divides a large table into smaller, manageable pieces called partitions based on a specified column. It improves query performance and maintenance. PostgreSQL supports range, list, and hash partitioning.

  93. While dropping a postgres database i am getting error? What might be the issue?

    If you want to drop a database , then you need to fire the command, after connecting to a different database in the same postgres cluster with superuser privilege.

  94. What is the pg_dump method used for?

    The pg_dump method allows you to create a text file with a set of SQL commands that, when run in a PostgreSQL server, will recreate the database in the same state as it was at the moment of the dump.

  95. Explain the concept of roles and privileges in PostgreSQL.

    PostgreSQL uses roles to manage user access and permissions. A role can be a user (with LOGIN) or a group (without LOGIN). Privileges define what actions a role can perform, such as SELECT, INSERT, UPDATE, DELETE, CREATE, CONNECT, and EXECUTE. Roles are granted privileges using the GRANT command and revoked using REVOKE. This role-based access control (RBAC) system ensures secure and efficient permission management.

  96. What is the latest version of postgres in market?

    Postgres 15.

  97. What is the purpose of the pg_hba.conf file in PostgreSQL?

    The pg_hba.conf file specifies client authentication rules, controlling which hosts are allowed to connect to the PostgreSQL server and how they authenticate.

  98. How can i check the version of postgres?

    cat PG_VERSION file.

    or select pg_version();

  99.  How do you perform a backup and restore in PostgreSQL?

    Use the pg_dump command for backups and the pg_restore command for restores. Example:
    bashCopy code
    pg_dump -h localhost -U username dbname > backup.sql pg_restore -h localhost -U username -d dbname backup.sql.

  100. How pg_repack works internally?

    pg_repack reorganizes tables and indexes online without locking writes. It creates a new table, copies data efficiently while tracking changes via triggers, applies modifications, and swaps the old table with the new one. This helps reclaim space and optimize performance without downtime.

  101. What is the purpose of the pg_stat_user_tables view in PostgreSQL?

    The pg_stat_user_tables view provides statistics about table-level activity and performance

  102. Is there a way in which we can rebuild/reorg a table online to release free space?

    As we know vacuum full is used to rebuild table and it releases free space to operating system. However this method, puts an exclusive lock on the table.

  103. Describe the purpose of the pg_stat_progress_cluster view in PostgreSQL.

    The pg_stat_progress_cluster view provides information about the progress of cluster reorganization operations.

  104. what is the maximum file size of table or index in postgres? Can we increase that ?

    Max size is 1GB.  If a table size is big, then it can spread across multiple files. lets says the file_name is 19870 . and once it reaches 1gb, a new file will be created as 19870.1 .

  105. How can you implement high availability and failover in PostgreSQL?

    Use tools like Patroni or repmgr to set up and manage streaming replication and automatic failover.

  106. What are the different datatypes in postgres?

    Boolean
    char,vchar,
    int,float(n)
    uuid
    date

  107. Explain the concept of parallel queries in PostgreSQL.

    Parallel queries use multiple worker processes to execute parts of a query concurrently, improving query performance.

  108. Difference between explain and explain analyze in postgres?

    Explain – > Generates query plan by calculating the cost
    Explain analyze -> It will execute the query and provides query statistics of the executed query.  This gives more accurate plan details. Please be careful while running insert,update,delete like DML commands with explain analyze, as it will run the query and cause data changes.

  109. What is the meaning of PgAdmin?

    PgAdmin is a free open-source graphical front-end PostgreSQL database administration tool. This web-based GUI tool is prominently used to manage PostgreSQL databases. It assists in monitoring and managing numerous complex PostgreSQL and EDB database systems. PgAdmin is used to accomplish tasks like accessing, developing, and carrying out quality testing procedures.

  110. What is the full form of MVCC?

    The full form of MVCC is Multi-version Concurrency Control.

  111. Describe the purpose of the pg_xact directory in PostgreSQL.

    The pg_xact directory contains transaction status files that track the state of active transactions.

  112. What is the full form of GEQO?

    The full form of GEQO is Genetic Query Optimization. It enables non-exhaustive search to efficiently manage large join queries in PostgreSQL.

  113. What is random page cost?

    random_page_cost is a PostgreSQL parameter that defines the cost of fetching a random disk page during query execution. It influences the query planner’s decision when choosing between index scans and sequential scans. A higher value (default: 4.0) makes sequential scans more favorable, while a lower value encourages index scans. Adjusting this setting based on storage type (HDD vs. SSD) can improve query performance.

  114. what different types of streaming replications are present in postgres? And which parameters control that.

    PostgreSQL supports asynchronous, synchronous, and logical replication. Asynchronous improves performance but risks data loss, synchronous ensures zero data loss by waiting for standby confirmation, and logical replication allows selective table replication.

    Key parameters:
    wal_level (replica/logical)
    max_wal_senders (limits WAL sender processes)
    synchronous_commit
    (controls commit behavior)
    primary_conninfo (connects standby to primary
    hot_standby (enables read queries on standby)

  115.  How can you monitor and manage autocommit behavior in PostgreSQL?

    Use the autocommit configuration parameter and the BEGIN and COMMIT statements to control autocommit behavior.

  116. what is the use of share_preload_libraries in postgres.conf file?

    Usually when we add extensions like pg_stat_statement, then we need to add the library path in the parameter shared_preload_libraries.
    Because these extensions use shared memory, we need to restart the postgres cluster.
    the reason we are preloading these libraries is  to avoid the library startup time, when the library is first used.

  117. Explain the concept of logical replication in PostgreSQL.

    Logical replication replicates data changes using a publish-subscribe mechanism at the row level.

  118. What are foreign data wrappers? What is its use?

    Foreign Data Wrappers (FDW) allow PostgreSQL to access external databases as if they were local tables. They use the SQL/MED (SQL Management of External Data) standard to fetch data from sources like other PostgreSQL instances, MySQL, MongoDB, or even CSV files.

  119. Describe the purpose of the pg_resetxlog utility in PostgreSQL.

    The pg_resetxlog utility is used to reset the write-ahead log (WAL) and recover from severe corruption.

  120. Difference between role and user in postgres? Can we convert a role to user?

    Role and user are almost same in postgres, only difference is , a role cannot login , But a user can. We can say like a user is  role with login privilege.
    And yes we can convert the role to a user.
    alter role <role_name> nologin;

  121. What are the default databases created after setting up postgres cluster?

    postgres=# select datname from pg_database;
    datname
    ———–
    postgres
    template1
    template0

  122. What is the use of temporary tablespace? 

    A temporary tablespace in PostgreSQL is used to store temporary objects like temp tables, sorting operations, and hash joins. It helps improve performance by preventing temp data from consuming the main tablespace. You can assign a temporary tablespace using:

    SET temp_tablespaces = 'my_temp_tablespace';
  123. How can you implement data encryption in PostgreSQL?

    PostgreSQL supports encryption via PGCrypto for column-level encryption, SSL for secure connections, and disk-level encryption using LUKS or filesystem tools. Transparent Data Encryption (TDE) requires third-party solutions.

  124. Explain the concept of full-text search in PostgreSQL.

    Full-text search allows you to search for words or phrases within text documents stored in the database.

  125. What tools do you use for monitoring PostgreSQL performance?

    To monitor PostgreSQL performance, use pg_stat_activity to track active queries and long-running transactions, and pg_stat_statements to analyze query execution statistics. Tools like EXPLAIN ANALYZE help optimize queries, while pgAdmin provides a GUI for real-time monitoring. For advanced monitoring, tools like Prometheus with PostgreSQL Exporter, Grafana, and PGBouncer can be used to track performance metrics, connections, and query execution times.

  126. What are the tablespaces created by default after installing postgres cluster?

    After installing a PostgreSQL cluster, two default tablespaces are created:

    1. pg_default – Stores user-defined tables, indexes, and other database objects by default.
    2. pg_global – Stores shared system catalogs that are accessible across all databases in the cluster.

    These tablespaces manage database storage unless additional tablespaces are created.

  127. How do you check the current connections and terminate a session?

    To check current connections in PostgreSQL, query pg_stat_activity:

    SELECT pid, usename, datname, client_addr, application_name, state FROM pg_stat_activity;
    

    To terminate a specific session, use:

    SELECT pg_terminate_backend(pid);
    

    For immediate termination, use:

    SELECT pg_cancel_backend(pid);
  128. What are common causes of deadlocks, and how do you resolve them?

    Deadlocks in PostgreSQL occur when two or more transactions hold locks that the other transactions need, causing a cyclic wait. Common causes include transactions updating rows in different orders, long-running transactions holding locks for too long, and foreign key constraints with cascading updates/deletes. To resolve deadlocks, ensure transactions access resources in a consistent order, keep transactions short, and use appropriate indexing to reduce lock contention. Monitoring pg_stat_activity and using deadlock_timeout settings can help detect and troubleshoot deadlocks efficiently.

  129. How do you handle bloated tables and indexes?

    To handle bloated tables and indexes in PostgreSQL, use the VACUUM and ANALYZE commands to reclaim storage and update statistics. For severe bloat, run VACUUM FULL, which locks the table but compacts it effectively. Use REINDEX to rebuild bloated indexes. The pg_stat_user_tables and pg_stat_all_indexes views help identify bloat. For automated maintenance, enable autovacuum or use tools like pg_repack to reclaim space without locking tables.

  130. Describe the purpose of the pg_wal directory in PostgreSQL.

    The pg_wal directory in PostgreSQL stores Write-Ahead Logging (WAL) files, which ensure data durability and crash recovery. WAL records all changes before they are written to disk, allowing PostgreSQL to recover uncommitted transactions after a crash. It also plays a key role in replication and point-in-time recovery (PITR).

  131. How do you check for long-running queries?

    Use the following query to check long-running queries in PostgreSQL:

    SELECT pid, age(clock_timestamp(), query_start) AS duration, state, query 
    FROM pg_stat_activity 
    WHERE state = 'active' 
    ORDER BY duration DESC;
  132. What is the use of EXPLAIN ANALYZE?

    EXPLAIN ANALYZE in PostgreSQL is used to analyze and optimize query performance. It provides a detailed execution plan by showing how the database processes the query, including cost estimates, row estimates, index usage, join methods, and execution time. This helps identify inefficiencies such as sequential scans instead of index scans, slow joins, or misestimated row counts. By using EXPLAIN ANALYZE, developers can fine-tune queries by adding indexes, rewriting queries, or adjusting PostgreSQL configuration settings to improve performance.

  133. How do you perform Point-In-Time Recovery (PITR)?

    To perform Point-In-Time Recovery (PITR) in PostgreSQL, first take a physical base backup using pg_basebackup or by copying the data directory. Enable WAL archiving by setting archive_mode = on and archive_command. To restore, stop the database, replace data files with the base backup, and configure recovery settings in recovery.conf (or postgresql.auto.conf in newer versions) with restore_command and recovery_target_time. Restart PostgreSQL, and it will replay WAL logs up to the specified point, ensuring precise data recovery.

  134. What is the purpose of **pg_hba.conf**?

    The pg_hba.conf (PostgreSQL Host-Based Authentication) file controls client authentication in PostgreSQL. It defines which users can connect, from which hosts, and what authentication methods are required. Each entry specifies a connection type (local, host, hostssl, hostnossl), database, user, client IP address, and authentication method (e.g., md5, trust, peer, scram-sha-256). PostgreSQL checks this file when a client attempts to connect, ensuring secure access based on predefined rules.

  135. How do you optimize query performance in PostgreSQL?

    To optimize query performance in PostgreSQL, use EXPLAIN ANALYZE to analyze execution plans and identify bottlenecks. Create appropriate indexes (B-tree, GIN, BRIN) based on query patterns. Use VACUUM and ANALYZE to update statistics and prevent bloat. Optimize queries with proper JOINs, WHERE filters, and indexing on frequently searched columns. Avoid **SELECT ***; instead, fetch only required columns. Use partitioning for large tables and connection pooling for handling multiple queries efficiently. Tune PostgreSQL settings like work_mem, shared_buffers, and parallel_execution for better performance.

  136. How do you restore a corrupted database?

    To restore a corrupted Oracle database, first identify the issue using V$DATABASE_BLOCK_CORRUPTION and alert logs. If datafiles are missing or corrupt, restore them using RMAN: RESTORE DATABASE; RECOVER DATABASE;. For control file corruption, restore from backup or recreate it. If redo logs are corrupted, clear or drop them. Block corruption can be fixed with RMAN BLOCKRECOVER or DBMS_REPAIR. If the entire database is affected, perform a full RMAN restore and recovery. Regular backups, ARCHIVELOG mode, and proactive monitoring help prevent future corruption.

  137. What is the purpose of the pg_basebackup command?

    The pg_basebackup command is used to take a physical backup of a PostgreSQL database, including data files and WAL segments. It is commonly used for full database backups, replication setup, and Point-In-Time Recovery (PITR). This tool ensures fast, consistent backups without requiring database downtime.