PostgreSQL DBA Interview Questions
-
What is PostgreSQL?
PostgreSQL is a lightweight, free, and open-source relational database management system. PostgreSQL is used widely across regions and companies and can be used in most popular operating systems.
-
What are the benefits of PostgreSQL?
PostgreSQL excels among other SQL databases for several reasons, including:
1.Robustness that makes it suitable for all kinds of applications
2.Free and open-source
3.Security and reliability
4.Wide variety of data types
5.A big community of users worldwide. -
How connection is established in postgres?
The PostgreSQL server (formerly known as Postmaster) acts as the main supervisor process, listening on a designated TCP/IP port for incoming client connections. Upon receiving a connection request, it spawns a separate backend process to handle that session. These backend processes communicate with each other and other instance components using shared memory and semaphores, ensuring data consistency during concurrent access. Once connected, the client sends queries to its assigned backend process, which then parses, plans, and executes the query. The results are transmitted back to the client over the established connection.
-
Explain the postgres architecture ?
PostgreSQL follows a multi-process architecture, where each client connection is handled by a separate backend process. The PostgreSQL server process (previously known as Postmaster) listens on a specified TCP/IP port for incoming client requests. When a connection request arrives, it spawns a dedicated backend process to handle that session. These backend processes interact with shared memory and use inter-process communication (IPC) mechanisms like semaphores and spinlocks to ensure data consistency. The architecture consists of key components such as shared memory (used for caching and transaction management), background processes (like WAL Writer, Checkpointer, and Autovacuum Daemon), and storage structures (such as tables, indexes, and transaction logs). PostgreSQL employs MVCC (Multi-Version Concurrency Control) to allow multiple transactions to run simultaneously without blocking each other. The WAL (Write-Ahead Logging) mechanism ensures data durability by logging changes before applying them to data files. This modular and process-based architecture enhances PostgreSQL’s scalability, reliability, and performance, making it suitable for both transactional and analytical workloads.
-
What are some important background processes in postgres?
PostgreSQL has several important background processes that ensure smooth database operation:
- Checkpointer – Flushes dirty pages from shared buffers to disk at regular intervals to reduce write overhead.
- WAL Writer – Writes transaction logs (WAL – Write-Ahead Logging) to ensure data durability and crash recovery.
- Autovacuum Daemon – Cleans up dead tuples to prevent table bloat and optimize performance.
- Background Writer – Helps reduce I/O spikes by pre-writing dirty pages to disk before checkpoints occur.
- Archiver – Handles WAL file archiving when continuous archiving is enabled.
- Stats Collector – Gathers database statistics for query optimization and performance monitoring.
- Logical Replication Launcher – Manages logical replication workers for data synchronization between databases.
- WAL Receiver/Sender – Handles streaming replication between primary and standby servers.
These background processes enhance performance, data integrity, and high availability in PostgreSQL.
-
What are the memory components in postgres?
PostgreSQL has several important memory components that optimize performance and manage database operations efficiently:
- Shared Buffers – Caches frequently accessed data pages to reduce disk I/O.
- Work Memory – Allocated per query for sorting and hashing operations.
- Maintenance Work Memory – Used for maintenance tasks like vacuuming and indexing.
- WAL Buffers – Temporary storage for Write-Ahead Logging (WAL) before writing to disk.
- Effective Cache Size – An estimate of how much OS cache PostgreSQL can use for query planning.
- Temp Buffers – Session-local buffers used for temporary tables.
- Kernel Page Cache – OS-level caching that stores frequently accessed data to improve performance.
These memory components help PostgreSQL handle queries efficiently, minimize disk access, and optimize overall performance.
-
what is the maximum file size of table or index in postgres? Can we increase that ?
Max size is 1GB. If a table size is big, then it can spread across multiple files.
-
When wal writer write data to wal segement?
The WAL Writer process writes data to WAL (Write-Ahead Logging) segments periodically to ensure durability and crash recovery. It writes data when:
- WAL Buffers are Full – If the WAL buffer fills up, WAL Writer flushes data to WAL segments.
- Timeout Occurs – It writes at regular intervals (
wal_writer_delay
, default 200ms) to prevent data loss. - Transaction Commit – WAL entries are flushed when a transaction commits to ensure durability.
- Checkpoints Trigger – During a checkpoint, WAL data is written and synced to disk for consistency.
-
When bgwriter writes data to disk?
The Background Writer (bgwriter) writes dirty pages from shared buffers to disk proactively to reduce I/O spikes and improve performance. It operates independently of checkpoints and writes data at regular intervals based on bgwriter_lru_maxpages and bgwriter_lru_multiplier settings. The primary triggers for writing include buffer eviction (when new pages are needed), time-based intervals (
bgwriter_delay
), and system workload. By gradually writing pages, it prevents sudden bursts of disk I/O during checkpoints, enhancing PostgreSQL’s overall efficiency and responsiveness. -
What is restore_command?
restore_command
in PostgreSQL is a configuration setting that specifies the command used to retrieve archived WAL (Write-Ahead Log) files during recovery or replication. -
What is the significance of pg_ident.conf file?
Just like we have os authenticated db users in oracle. Here in postgres also we have similarconcept. We can provide mapping of os user and postgres db user inside pg_ident.conf file.
-
What is recovery_target_time?
recovery_target_time
is a configuration parameter in PostgreSQL that specifies the exact point in time to which the database should be recovered during point-in-time recovery. -
What is pg_dumpall?
-
What are some wal related parameter in postgrs.conf file?
max_wal_size → It defines the soft limit for the total WAL segment size. If this limit is reached, a checkpoint is triggered to free up space.
wal_keep_segments → Specifies the number of old WAL segments to retain for standby servers. (Deprecated in newer versions, replaced bywal_keep_size
.)
wal_keep_size → Defines the size (in MB or GB) of WAL files to retain for replication before they are recycled.
max_wal_senders = 10 → Sets the maximum number of WAL sender processes that handle streaming replication to standby servers. -
What is a tablespace?
In PostgreSQL, a tablespace is a storage location on disk used to store database objects like tables and indexes. It helps in managing disk usage and optimizing performance by placing data on specific storage devices.
Example of Creating a Tablespace:
-
-
Which parameters controls the behaviour of BGWriter?
The Background Writer (bgwriter) behavior in PostgreSQL is controlled by the following parameters:
- bgwriter_delay → Defines the interval (in ms) between background writer runs (default 200ms).
- bgwriter_lru_maxpages → Sets the maximum number of dirty pages written to disk per bgwriter cycle.
- bgwriter_lru_multiplier → Determines how aggressively bgwriter writes pages based on buffer usage (default 2.0).
-
Does postgres support direct i/o.?
No, PostgreSQL does not support Direct I/O natively. It relies on the operating system’s page cache for disk I/O operations, which improves performance by reducing direct disk access.
-
What is the difference between TRUNCATE and DELETE?
TRUNCATE is faster, removes all rows at once
DELETE removes rows one by one, can use WHERE clause -
What is continuous archiving?
Method of archiving WAL files continuously
Essential for PITR capability -
What is autovacuum in PostgreSQL?
Automated process that removes dead tuples and updates statistics
Essential for maintaining database performance -
How do you create and drop a database in PostgreSQL?
To create a database in PostgreSQL, use the following command:
CREATE DATABASE dbname;
To drop a database, use:
DROP DATABASE dbname;
Ensure no active connections to the database before dropping it. You can check for active connections using
pg_stat_activity
. -
How do you take a hot backup?
To take a hot backup in PostgreSQL using pg_basebackup while the database is running:
-
Enable ARCHIVE_MODE:
Setarchive_mode = on
and configurearchive_command
inpostgresql.conf
, then restart PostgreSQL. -
Run pg_basebackup:
Usepg_basebackup
to take a backup:
-
-
How do you create a user and assign privileges in PostgreSQL?
To create a user in PostgreSQL, use the
CREATE USER
command:CREATE USER username WITH PASSWORD 'password';
To assign privileges, you can grant specific permissions like SELECT, INSERT, UPDATE, etc., using the
GRANT
command. For example:GRANT ALL PRIVILEGES ON DATABASE dbname TO username;
This grants the user all privileges on the specified database. You can also grant table-specific privileges:
GRANT SELECT, INSERT ON table_name TO username;
-
How do you take a full database backup using **pg_dump**?
To take a full database backup using
pg_dump
, use the following command: -
How do you manage users and roles in PostgreSQL?
-
- Password-Based Authentication – Uses MD5, SCRAM-SHA-256, or plain passwords for user verification.
- Trust – Allows connections without authentication (used for local, trusted environments).
- Peer Authentication – Uses OS user credentials to match PostgreSQL roles (Unix/Linux only).
- Ident Authentication – Relies on an external Ident server to verify the user.
- Kerberos, LDAP, PAM, and SSPI – Supports enterprise authentication mechanisms for secure login.
- Certificate-Based Authentication – Uses SSL/TLS certificates for secure connections.
-
What is the significance of pg_ident.conf file?
The pg_ident.conf file in PostgreSQL is used for user mapping when using Ident or external authentication methods like LDAP, Kerberos, or PAM. It allows mapping OS-level or external usernames to PostgreSQL roles, enabling flexible authentication.
-
What is recovery.conf?
In PostgreSQL <12, the recovery configuration is managed using the
recovery.conf
file.
In PostgreSQL ≥12, recovery is managed usingrecovery.signal
for recovery-related settings andstandby.signal
to enable replication. -
What is pg_dumpall?
-
What is archive_command?
-
How do you handle bloated tables and indexes?
-
What are the different types of backups in PostgreSQL?
- Logical Backup – Uses tools like
pg_dump
andpg_dumpall
to export data in SQL or other formats for selective restoration. - Physical Backup – Copies database files directly using
pg_basebackup
or file system-level backups for full database recovery. - Point-In-Time Recovery (PITR) – Combines WAL archiving with a base backup to restore the database to a specific point in time.
- Logical Backup – Uses tools like
-
What is this wal_level parameter , different values of wal_level?
The
wal_level
parameter controls the amount of information written to the Write-Ahead Log (WAL), impacting replication, backups, and point-in-time recovery (PITR).Different Values of
wal_level
- minimal → Generates only enough WAL for crash recovery (no replication or PITR).
- replica (default) → Supports streaming replication and PITR, recording changes at the row level.
- logical → Includes additional details for logical replication and decoding (used for logical replication tools).
-
What is visibility map in postgres?
The Visibility Map (VM) in PostgreSQL is a lightweight data structure that tracks which pages in a table contain only frozen tuples (i.e., tuples that do not need further vacuuming). It helps autovacuum and index-only scans work efficiently by reducing the need to scan unnecessary pages.
When a page is marked all-visible in the Visibility Map, PostgreSQL can skip scanning it during vacuum operations, improving performance. Additionally, index-only scans use the VM to determine if they can return results without accessing heap pages, making queries faster. The VM is automatically maintained by vacuum and autovacuum processes.
-
What are statistics in PostgreSQL?
**Statistics** in PostgreSQL are collected metadata about database objects (such as tables, indexes, and columns) that help the query planner optimize query execution by estimating the cost of different query plans. for a couple of seconds
-
How do you list all databases and tables?
To list all databases in PostgreSQL, use the following command:
\l -- or \list
To list all tables in the current database, use:
\dt
-
What is visibility map in postgres?
Every heap relation (i.e table/index) have a visibility map associated with them.Every visibility map has 2 bits per page.The first bit, if set, indicates that the page is all-visible (means those pages need not to bevacuumed)The second bit, if set means, all tuples on this page has been frozen. (No need to vacuum) Note – > Visibility map bits are set by VACUUM operation. And if data is modified, bits will becleared.This condition helps in index only scan.
-
What is the difference between **VACUUM**, **ANALYZE**, and **VACUUM FULL**?
-
What is free space mapping (FSM) in postgres?
Each table/index has a Free space mapping file. It keeps information about which pages are free.The VACUUM process also updates the Free Space Map and using VACUUM FULL we canrecover those free spaces.
-
How do you perform a failover in PostgreSQL?
-
What is free space mapping(FSM) in postgres?
The Free Space Map (FSM) in PostgreSQL is a data structure that tracks available free space in table pages. It helps PostgreSQL efficiently find pages with enough room to insert new tuples, reducing the need for table bloat and minimizing random I/O.
FSM is automatically updated during vacuum and insert operations. When a new row is inserted, PostgreSQL checks FSM to locate a page with sufficient free space instead of always appending data to the end of the table. This improves storage efficiency and helps maintain better performance over time.
-
-
Explain Logical Replication vs Physical Replication
-
How does partitioning work in PostgreSQL?
Partitioning in PostgreSQL divides a large table into smaller, manageable pieces (partitions) to improve query performance and maintenance. PostgreSQL supports range, list, and hash partitioning, where data is stored in separate child tables based on predefined rules. The planner optimizes queries by scanning only relevant partitions (partition pruning), reducing I/O and execution time.
- Describe the purpose of the pg_isready command in PostgreSQL.What is initial fork?
Thepg_isready
command in PostgreSQL is used to check the availability and readiness of a PostgreSQL database server. It helps in monitoring database health and is commonly used in scripts and automated systems to detect if the server is accepting connections. -
How do you enable SSL in PostgreSQL?
-
Explain the role of **WAL (Write-Ahead Logging)** in PostgreSQL.
-
What is TOAST in PostgreSQL?
TOAST (The Oversized-Attribute Storage Technique) is a mechanism in PostgreSQL that stores large values efficiently by compressing and storing them out-of-line in a separate table. It is used when a row exceeds the page size limit (typically 8KB).
When a column, like TEXT, BYTEA, or JSONB, contains a large value, PostgreSQL automatically compresses and moves it to a TOAST table, keeping only a reference in the main table. This helps reduce bloat, optimize queries, and improve performance by keeping frequently accessed data in memory while storing large values separately.
-
Explain Row-Level Security (RLS) in PostgreSQL.
-
Explain the difference between TOAST tables and regular tables in PostgreSQL.
In PostgreSQL, TOAST (The Oversized-Attribute Storage Technique) tables store large column values separately from the main table, while regular tables store all data within the main table heap.
-
What is the default block size in postgres? Can we set different block size?
The default block size in PostgreSQL is 8 KB. It is defined at compile time using the
BLCKSZ
parameter.You cannot change the block size for an existing database, but you can set a different block size (between 1 KB and 32 KB) by recompiling PostgreSQL from source with a customBLCKSZ
value. -
How does PostgreSQL handle **MVCC (Multi-Version Concurrency Control)?
PostgreSQL uses Multi-Version Concurrency Control (MVCC) to manage concurrent transactions without blocking reads and writes. Instead of locking rows, MVCC creates multiple versions of a record, allowing transactions to access the appropriate snapshot based on their start time. Dead tuples (old versions) are later cleaned up by autovacuum.
-
How do you detect and fix database corruption?
-
What happens in the background during vacuuming process?
Background Process During VACUUM in PostgreSQL
When VACUUM runs in PostgreSQL, it performs multiple background tasks to clean up dead tuples and optimize storage without blocking normal operations. It scans table pages to identify dead tuples left behind by UPDATE and DELETE operations. These tuples are then marked as reusable in the Free Space Map (FSM), allowing future inserts to reuse space efficiently.
If AUTOVACUUM is enabled, it runs automatically, adjusting workload based on system activity. VACUUM FULL, however, rewrites the entire table, reclaiming space but requiring more resources. Additionally, ANALYZE can be triggered to update statistics, improving query planner decisions for better performance.
-
How do you monitor and manage autovacuum in PostgreSQL?
Use the pg_stat_progress_vacuum view and autovacuum configuration parameters to monitor and control the automatic vacuum process.
- What are the different **index types** in PostgreSQL?
PostgreSQL supports several index types to optimize query performance:- B-Tree Index – Default and most commonly used, ideal for range and equality queries.
- Hash Index – Optimized for exact matches, but less commonly used due to WAL limitations (before PostgreSQL 11).
- GIN (Generalized Inverted Index) – Best for full-text search and JSONB queries.
- GiST (Generalized Search Tree) – Used for geospatial, full-text, and complex data types.
- BRIN (Block Range INdex) – Efficient for large tables with sequentially correlated data.
- SP-GiST (Space-Partitioned GiST) – Supports hierarchical and non-balanced tree structures.
-
What is the significance of search_path in postgres?
The
search_path
in PostgreSQL defines the schema lookup order when executing SQL queries. It allows users to refer to objects (tables, views, functions) without specifying their schema explicitly. When a query is run, PostgreSQL searches for the object in the schemas listed insearch_path
, from left to right, until it finds a match.By default, it includes the
public
schema, but it can be customized to prioritize specific schemas. This helps organize database objects, avoid naming conflicts, and enhance security by restricting access to certain schemas. -
Difference between pg_log, pg_clog, pg_xlog?
pg_log
→ Stores database server logs, including errors, queries, and system messages. This helps in troubleshooting and monitoring database activity.pg_clog
(nowpg_xact
in newer versions) → Stores transaction commit status information, tracking whether a transaction is committed or rolled back. Essential for transaction recovery and MVCC consistency.pg_xlog
(nowpg_wal
) → Contains Write-Ahead Logs (WAL), which ensure data durability and help in crash recovery. It records all changes before they are written to disk.
-
What is ctid?
ctid
is a system column in PostgreSQL that uniquely identifies the physical location of a row within a table. It stores the block number and tuple index (offset) within the block, helping PostgreSQL quickly locate rows.Since
ctid
changes when a row is updated or moved (e.g., byVACUUM FULL
), it is not a reliable primary key but is useful for efficient row lookups, indexing, and low-level operations like DELETE optimizations. -
How can you monitor and manage replication lag in PostgreSQL streaming replication?
Calculate replication lag by comparing the current WAL location on the primary with the corresponding location on the standby
-
What is oid?
Every row in postgres will have a object identifier called oid.
-
difference between oid and relfilenode?
oid
(Object Identifier) → A unique identifier assigned to database objects like tables, indexes, and functions. It is stored in system catalogs (pg_class
,pg_database
, etc.) and remains constant unless the object is dropped and recreated.relfilenode
→ Represents the physical file name on disk for a table or index. PostgreSQL usesrelfilenode
to map a relation (pg_class.relfilenode
) to its actual storage file in thebase/
directory. IfVACUUM FULL
orREINDEX
is performed, therelfilenode
can change, but theoid
remains the same.
-
Difference between postgres.conf and postgres.auto.conf file?
postgres.conf is the configuration file of the postgres cluster. But when we do any config changes using alter system command, then those parameters are added in postgres.auto.conf file.
When postgres starts , it will first read postgres.conf and then it will read postgres.auto.conf file. -
What is the use of **materialized views**?
A materialized view in PostgreSQL stores the result of a query physically on disk, allowing faster retrieval of complex queries. Unlike regular views, it does not update automatically and requires
REFRESH MATERIALIZED VIEW
to update data. -
What is effective_cache_size?
**effective_cache_size** is a PostgreSQL configuration parameter that estimates the amount of memory available for disk caching by the operating system and database, helping the query planner optimize queries.
-
What is work_mem?
**work_mem** is a PostgreSQL configuration parameter that sets the amount of memory allocated for operations like sorting and hashing before writing data to disk. for a couple of seconds
-
What is shared_buffers?
-
What is pgAdmin in PostgreSQL?
pgAdmin is a handy utility that comes with the PostgreSQL installation, and it lets you do regular database-related tasks through a nice graphical interface.
-
What is timeline in postgres?
Timeline in postgres is used to distinguish between original cluster and recovered one. When we initialize the cluster, the timelineid will be set to 1. But if database recovery happens then it will increase to 2.
-
What is the default port in postgres?
Default is 5432.
-
What is difference between pg_cancel_backend vs pg_terminate_backend?
pg_cancel_backend(pid)
→ Gracefully cancels a running query without terminating the session. It sends a SIGINT signal to the backend process, allowing it to clean up and stop execution safely. However, the session remains connected.pg_terminate_backend(pid)
→ Forcibly kills the backend process, terminating the entire session. It sends a SIGTERM signal, closing all active transactions abruptly, which may lead to rollback.
Key Difference
- Use
pg_cancel_backend
when you want to stop a long-running query without disconnecting the session. - Use
pg_terminate_backend
when you need to forcefully disconnect a session, such as in case of blocking transactions or resource-heavy queries.
-
What is the use of pgbench utility?
pgbench
is a benchmarking tool used to test PostgreSQL performance by simulating workloads and measuring transaction throughput. It helps in stress testing, performance tuning, and identifying bottlenecks. Users can run default or custom SQL workloads to analyze system efficiency under different loads -
How can you configure and manage replication slots in PostgreSQL?
Use the pg_create_logical_replication_slot and pg_drop_replication_slot functions to manage replication slots.
-
How can we encrypt specific columns in postgres?
PostgreSQL does not provide built-in column-level encryption, but you can encrypt specific columns using PGCrypto or client-side encryption.
-
Which utility is used to upgrade postgres cluster?
pgbouncer is used for connection pooling
-
What are the popular tools for managing backup and recovery in postgres?
edb bart , barman etc
-
What are the different types of replication in PostgreSQL?
-
Describe the purpose of the pg_rewind command in PostgreSQL.
The
pg_rewind
command in PostgreSQL is used to resynchronize a standby or failed primary server with a new primary after a failover. Instead of performing a full base backup,pg_rewind
efficiently rewinds the diverged instance by copying only the changed data, making it faster and reducing downtime. -
What is an extension in postgres? Which extensions you have used ?
pg_stat_statements
pg_track_settings
pg_profile
-
How do you set up Streaming Replication?
-
How to kill a session in postgres?
In PostgreSQL, you can terminate or cancel a session using the following functions:
Cancel a Query (without killing the session):
This stops the running query but keeps the session active.
Terminate a Session (forcefully kill the session):
-
How you monitor long running queries in postgres?
we can use pg_stat_activities to track .
-
What are **CTEs (Common Table Expressions)**, and how do they work?
Common Table Expressions (CTEs) in PostgreSQL are temporary result sets that simplify complex queries and improve readability. Defined using the
WITH
clause, CTEs can be referenced multiple times within a query. They are especially useful for recursive queries and breaking down complex logic. -
What is the role of pg_stat_replication?
pg_stat_replication
is a PostgreSQL system view that monitors replication on the primary server, showing active standby connections, replication lag, and status. Key columns include client_addr (standby IP), state (streaming
,catchup
), and sent_lsn, write_lsn, replay_lsn (WAL positions). Use it to check replication health:SELECT pid, client_addr, state, sent_lsn, replay_lsn, sync_state FROM pg_stat_replication;
-
What is the purpose of the pg_stat_bgwriter view in PostgreSQL?
The pg_stat_bgwriter view in PostgreSQL provides statistics about the background writer process, which helps manage buffer writes to disk. It tracks metrics like checkpoints, buffers written, and WAL sync operations, aiding in performance tuning and monitoring database I/O behavior.
To check bgwriter stats:
-
What is table partitioning in postgres? What are the advantages?
Table Partitioning in PostgreSQL
Table partitioning divides a large table into smaller partitions based on criteria like range, list, or hash. The parent table is a logical structure, while actual data is stored in partitions.
Advantages:
- Faster Queries – Scans only relevant partitions.
- Efficient Data Management – Easy to archive or drop old partitions.
- Better Performance – Improves bulk inserts and deletes.
- Optimized Indexing – Smaller and more efficient indexes.
- Reduced Contention – Enhances concurrency by distributing load.
-
Between postgres and nosql database like mongodb , which one is better?
The choice between PostgreSQL and MongoDB depends on your use case:
PostgreSQL (Relational DB) – Best for structured data, strong ACID compliance, complex queries, and transactions. Suitable for financial systems, analytics, and applications requiring strict data consistency.MongoDB (NoSQL, Document DB) – Ideal for unstructured or semi-structured data, high scalability, and flexible schemas. Best for big data, real-time apps, and fast-growing applications with schema changes.
Which is Better?
Choose PostgreSQL for reliable transactions, reporting, and complex queries.
Choose MongoDB for scalability, flexibility, and high-speed document-based storage. - Explain the difference between **JSON and JSONB** in PostgreSQL.
In PostgreSQL, JSON stores data as a plain text representation, preserving formatting but requiring re-parsing for each query. JSONB, on the other hand, stores data in a binary format, allowing faster searches, indexing, and efficient storage. While JSON is better for simple storage and retrieval, JSONB is preferred for complex queries and indexing. -
What are some key difference between oracle and postgres?
Oracle is a proprietary, enterprise-grade relational database, while PostgreSQL is an open-source, community-driven database. Oracle offers advanced scalability, partitioning, and multi-tenant architecture, making it ideal for large enterprises and mission-critical applications. It supports PL/SQL for procedural programming, advanced replication options like Active Data Guard and GoldenGate, and enterprise-grade backup and recovery tools such as RMAN.
PostgreSQL, on the other hand, is highly extensible, allowing users to define custom data types, functions, and extensions. It supports PL/pgSQL, along with other languages like Python and Java. PostgreSQL is preferred for modern applications due to its native JSON support, flexible partitioning, and cost-effectiveness. While Oracle provides robust enterprise solutions, PostgreSQL excels in openness, flexibility, and cost efficiency, making it a strong alternative for organizations looking to avoid high licensing costs.
-
How do you create a new database in PostgreSQL?
To create a new database, use the SQL command:
sqlCopy code
CREATE DATABASE dbname; -
How does partitioning work in PostgreSQL?
Partitioning in PostgreSQL divides a large table into smaller, manageable pieces called partitions based on a specified column. It improves query performance and maintenance. PostgreSQL supports range, list, and hash partitioning.
-
While dropping a postgres database i am getting error? What might be the issue?
If you want to drop a database , then you need to fire the command, after connecting to a different database in the same postgres cluster with superuser privilege.
-
What is the pg_dump method used for?
The pg_dump method allows you to create a text file with a set of SQL commands that, when run in a PostgreSQL server, will recreate the database in the same state as it was at the moment of the dump.
-
Explain the concept of roles and privileges in PostgreSQL.
PostgreSQL uses roles to manage user access and permissions. A role can be a user (with
LOGIN
) or a group (withoutLOGIN
). Privileges define what actions a role can perform, such as SELECT, INSERT, UPDATE, DELETE, CREATE, CONNECT, and EXECUTE. Roles are granted privileges using theGRANT
command and revoked usingREVOKE
. This role-based access control (RBAC) system ensures secure and efficient permission management. -
What is the latest version of postgres in market?
Postgres 15.
-
What is the purpose of the pg_hba.conf file in PostgreSQL?
The pg_hba.conf file specifies client authentication rules, controlling which hosts are allowed to connect to the PostgreSQL server and how they authenticate.
-
How can i check the version of postgres?
cat PG_VERSION file.
or select pg_version();
-
How do you perform a backup and restore in PostgreSQL?
Use the pg_dump command for backups and the pg_restore command for restores. Example:
bashCopy code
pg_dump -h localhost -U username dbname > backup.sql pg_restore -h localhost -U username -d dbname backup.sql. -
How pg_repack works internally?
pg_repack reorganizes tables and indexes online without locking writes. It creates a new table, copies data efficiently while tracking changes via triggers, applies modifications, and swaps the old table with the new one. This helps reclaim space and optimize performance without downtime.
-
What is the purpose of the pg_stat_user_tables view in PostgreSQL?
The pg_stat_user_tables view provides statistics about table-level activity and performance
-
Is there a way in which we can rebuild/reorg a table online to release free space?
As we know vacuum full is used to rebuild table and it releases free space to operating system. However this method, puts an exclusive lock on the table.
-
Describe the purpose of the pg_stat_progress_cluster view in PostgreSQL.
The pg_stat_progress_cluster view provides information about the progress of cluster reorganization operations.
-
what is the maximum file size of table or index in postgres? Can we increase that ?
Max size is 1GB. If a table size is big, then it can spread across multiple files. lets says the file_name is 19870 . and once it reaches 1gb, a new file will be created as 19870.1 .
-
How can you implement high availability and failover in PostgreSQL?
Use tools like Patroni or repmgr to set up and manage streaming replication and automatic failover.
-
What are the different datatypes in postgres?
Boolean
char,vchar,
int,float(n)
uuid
date -
Explain the concept of parallel queries in PostgreSQL.
Parallel queries use multiple worker processes to execute parts of a query concurrently, improving query performance.
-
Difference between explain and explain analyze in postgres?
Explain – > Generates query plan by calculating the cost
Explain analyze -> It will execute the query and provides query statistics of the executed query. This gives more accurate plan details. Please be careful while running insert,update,delete like DML commands with explain analyze, as it will run the query and cause data changes. -
What is the meaning of PgAdmin?
PgAdmin is a free open-source graphical front-end PostgreSQL database administration tool. This web-based GUI tool is prominently used to manage PostgreSQL databases. It assists in monitoring and managing numerous complex PostgreSQL and EDB database systems. PgAdmin is used to accomplish tasks like accessing, developing, and carrying out quality testing procedures.
-
What is the full form of MVCC?
The full form of MVCC is Multi-version Concurrency Control.
-
Describe the purpose of the pg_xact directory in PostgreSQL.
The pg_xact directory contains transaction status files that track the state of active transactions.
-
What is the full form of GEQO?
The full form of GEQO is Genetic Query Optimization. It enables non-exhaustive search to efficiently manage large join queries in PostgreSQL.
-
What is random page cost?
-
what different types of streaming replications are present in postgres? And which parameters control that.
PostgreSQL supports asynchronous, synchronous, and logical replication. Asynchronous improves performance but risks data loss, synchronous ensures zero data loss by waiting for standby confirmation, and logical replication allows selective table replication.
Key parameters:
wal_level
(replica/logical)
max_wal_senders
(limits WAL sender processes)
(controls commit behavior)
synchronous_commit
primary_conninfo
(connects standby to primary
hot_standby
(enables read queries on standby) -
How can you monitor and manage autocommit behavior in PostgreSQL?
Use the autocommit configuration parameter and the BEGIN and COMMIT statements to control autocommit behavior.
-
what is the use of share_preload_libraries in postgres.conf file?
Usually when we add extensions like pg_stat_statement, then we need to add the library path in the parameter shared_preload_libraries.
Because these extensions use shared memory, we need to restart the postgres cluster.
the reason we are preloading these libraries is to avoid the library startup time, when the library is first used. -
Explain the concept of logical replication in PostgreSQL.
Logical replication replicates data changes using a publish-subscribe mechanism at the row level.
-
What are foreign data wrappers? What is its use?
Foreign Data Wrappers (FDW) allow PostgreSQL to access external databases as if they were local tables. They use the SQL/MED (SQL Management of External Data) standard to fetch data from sources like other PostgreSQL instances, MySQL, MongoDB, or even CSV files.
-
Describe the purpose of the pg_resetxlog utility in PostgreSQL.
The pg_resetxlog utility is used to reset the write-ahead log (WAL) and recover from severe corruption.
-
Difference between role and user in postgres? Can we convert a role to user?
Role and user are almost same in postgres, only difference is , a role cannot login , But a user can. We can say like a user is role with login privilege.
And yes we can convert the role to a user.
alter role <role_name> nologin; -
What are the default databases created after setting up postgres cluster?
postgres=# select datname from pg_database;
datname
———–
postgres
template1
template0 -
What is the use of temporary tablespace?
A temporary tablespace in PostgreSQL is used to store temporary objects like temp tables, sorting operations, and hash joins. It helps improve performance by preventing temp data from consuming the main tablespace. You can assign a temporary tablespace using:
-
How can you implement data encryption in PostgreSQL?
PostgreSQL supports encryption via PGCrypto for column-level encryption, SSL for secure connections, and disk-level encryption using LUKS or filesystem tools. Transparent Data Encryption (TDE) requires third-party solutions.
-
Explain the concept of full-text search in PostgreSQL.
Full-text search allows you to search for words or phrases within text documents stored in the database.
-
What tools do you use for monitoring PostgreSQL performance?
-
What are the tablespaces created by default after installing postgres cluster?
After installing a PostgreSQL cluster, two default tablespaces are created:
- pg_default – Stores user-defined tables, indexes, and other database objects by default.
- pg_global – Stores shared system catalogs that are accessible across all databases in the cluster.
These tablespaces manage database storage unless additional tablespaces are created.
-
How do you check the current connections and terminate a session?
To check current connections in PostgreSQL, query
pg_stat_activity
:To terminate a specific session, use:
For immediate termination, use:
-
What are common causes of deadlocks, and how do you resolve them?
-
How do you handle bloated tables and indexes?
-
Describe the purpose of the pg_wal directory in PostgreSQL.
The pg_wal directory in PostgreSQL stores Write-Ahead Logging (WAL) files, which ensure data durability and crash recovery. WAL records all changes before they are written to disk, allowing PostgreSQL to recover uncommitted transactions after a crash. It also plays a key role in replication and point-in-time recovery (PITR).
-
How do you check for long-running queries?
Use the following query to check long-running queries in PostgreSQL:
SELECT pid, age(clock_timestamp(), query_start) AS duration, state, query FROM pg_stat_activity WHERE state = 'active' ORDER BY duration DESC;
-
What is the use of EXPLAIN ANALYZE?
-
How do you perform Point-In-Time Recovery (PITR)?
To perform Point-In-Time Recovery (PITR) in PostgreSQL, first take a physical base backup using
pg_basebackup
or by copying the data directory. Enable WAL archiving by settingarchive_mode = on
andarchive_command
. To restore, stop the database, replace data files with the base backup, and configure recovery settings inrecovery.conf
(orpostgresql.auto.conf
in newer versions) withrestore_command
andrecovery_target_time
. Restart PostgreSQL, and it will replay WAL logs up to the specified point, ensuring precise data recovery. -
What is the purpose of **pg_hba.conf**?
-
How do you optimize query performance in PostgreSQL?
-
How do you restore a corrupted database?
-
What is the purpose of the pg_basebackup command?