Tuesday, January 12, 2010

intro postgresql

Intro to PostgreSQL

* Relational SQL database, like MySQL
* ACID compliant (like MySQL InnoDB)
o Atomicity -- Guarantees that all statements in a transaction are performed, or none of them are
o Consistency -- Database is always in a consistent state
o Isolation -- Partially completed transactions are invisible to other queries or transactions
o Durability -- Once the user is notified of success, the transaction's changes are guaranteed to be recorded permanently
* Widely considered the most advanced open source RDBMS
o Can write stored procedures in many languages -- Perl, Python, Java, TCL, PL/PgSQL, even C!
o Triggers
o Cool built in types like geometric objects and network addresses
o Can implement custom data types and operators
* Excellent, very readable documentation
* Large, helpful community


Installation
Yum is your friend. Important packages:

* postgresql-libs -- libraries, the bare minimum requirement for a client program to connect to PostgreSQL
* postgresql -- contains client programs themselves, such as the command line client, dump utility, etc
* postgresql-server -- the PostgreSQL server binaries
* postgresql-contrib -- cool additional things like data structures and utilities
* postgresql-plperl/-plpython -- supports stored procedure languages
* postgresql-devel -- needed to compile C software against PostgreSQL libraries

Assuming the server is installed, this should get it running:

# service postgresql initdb
# service postgresql start

Test login, remembering that the default authentication is 'ident'. The PostgreSQL superuser is called 'postgres'.

# su - postgres
$ psql template1
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.

The latest version is 8.3. RHEL 5 comes with PostgreSQL 8.1, which is OK. RHEL 4 comes with 7.4, which is nasty old. Never start a new PostgreSQL user with 7.4.

Fortunately an excellent YUM repository is available for RHEL 4 and 5, supporting all recent PostgreSQL versions.

http://yum.pgsqlrpms.org/

Click on "Yum Howto". Couldn't be much easier.
Configuration

Main directory: /var/lib/pgsql/data

Main configuration file is postgresql.conf

* max_connections -- number of client connections, same as MySQL
* listen_addresses -- localhost by default, change to '*' if it should be accessible to other computers
* port -- 5432 by default, the standard PostgreSQL port
* more parameters for tuning ...

Authentication configuration is in pg_hba.conf. Default configuration is "ident", meaning the system user is assumed to be the same as the database user. More on this later.

PG_VERSION contains the database major version number that must be used on this data store.
Connecting and Creating Users and Databases

On a new PostgreSQL install, you need to su - postgres before doing anything with it, because it checks that the database user is the same as the system user. System users do not automatically exist in the database. For example, as user micah:

$ psql
psql: FATAL: Ident authentication failed for user "micah"

This means the user doesn't exist. Let's create it, as user postgres:

$ createuser micah
Shall the new role be a superuser? (y/n) y

Now, as micah:

$ psql
psql: FATAL: database "micah" does not exist
$ createdb
$ psql -l
List of databases
Name | Owner | Encoding
-----------+----------+----------
postgres | postgres | UTF8
template0 | postgres | UTF8
template1 | postgres | UTF8
micah | micah | UTF8

$ psql
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.

The client utilities assume the DB name is the same as the user name, unless another DB name is presented on the command line. First, the 'micah' database did not exist, but the createdb command created it. Now we can connect.

Often it is a good idea to create users with fewer privileges. Again as the postgres user:

$ createuser bozo
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n

Demonstration of Use

Creating tables works as expected. Let's create some to play with, as user micah:

create table names (
id serial primary key,
first varchar not null,
last varchar not null
);

Table names is straightforward, except for the serial type. This is sort of like auto_increment in MySQL but with a difference. Instead of automatically choosing a number higher than the maximum, it uses a sequence. Sequences are separate entities in PostgreSQL and can be independently manipulated. Of course, if you set a sequence to a value such that it will choose a number which already exists in a primary key column, you will hit an error! Let's demonstrate inserts and sequences:

micah=# insert into names (first, last) values ('Lanham', 'Napier');
INSERT 0 1
micah=# select * from names;
id | first | last
----+--------+--------
1 | Lanham | Napier
(1 row)

micah=# select currval('names_id_seq');
currval
---------
1
(1 row)

micah=# insert into names (first, last) values ('Graham', 'Weston');
INSERT 0 1
micah=# select currval('names_id_seq');
currval
---------
2
(1 row)

micah=# select setval('names_id_seq', 1);
setval
--------
1
(1 row)

micah=# insert into names (first, last) values ('Mickey', 'Mouse');
ERROR: duplicate key value violates unique constraint "names_pkey"
micah=# select currval('names_id_seq');
currval
---------
2
(1 row)

micah=# insert into names (first, last) values ('Mickey', 'Mouse');
INSERT 0 1
micah=# select * from names;
id | first | last
----+--------+--------
1 | Lanham | Napier
2 | Graham | Weston
3 | Mickey | Mouse
(3 rows)

micah=# delete from names where id=3;
DELETE 1
micah=# insert into names (first, last) values ('Frederick', 'Mendler');
INSERT 0 1
micah=# select * from names;
id | first | last
----+-----------+---------
1 | Lanham | Napier
2 | Graham | Weston
4 | Frederick | Mendler
(3 rows)

Let's add another table to play with:

create table kudos (
id int not null references names on delete cascade,
kudo varchar
);

Here we see that the id field is a foreign key. Its value must exist in the names table. Furthermore, if the corresponding row in the names table is deleted, all corresponding kudos will also be deleted!

micah=# insert into kudos values (1, 'Great job!!!');
INSERT 0 1
micah=# insert into kudos values (2, 'Yeah!!!');
INSERT 0 1
micah=# insert into kudos values (3, 'Way to be Fanatical');
ERROR: insert or update on table "kudos" violates foreign key constraint "kudos_id_fkey"
DETAIL: Key (id)=(3) is not present in table "names".
micah=# insert into kudos values (4, 'Way to be Fanatical');
INSERT 0 1
micah=# insert into names (first, last) values ('Bozo', 'Clown');
INSERT 0 1
micah=# insert into kudos values (5, 'Nice!');
INSERT 0 1
micah=# insert into kudos values (5, 'Sweet!!!');
INSERT 0 1
micah=# select * from kudos;
id | kudo
----+---------------------
1 | Great job!!!
2 | Yeah!!!
4 | Way to be Fanatical
5 | Nice!
5 | Sweet!!!
(5 rows)

micah=# delete from names where id=5;
DELETE 1
micah=# select * from kudos;
id | kudo
----+---------------------
1 | Great job!!!
2 | Yeah!!!
4 | Way to be Fanatical
(3 rows)

And yet another table ...

create table salaries (
id int not null primary key references names on delete cascade,
salary int not null,
check ( salary > 25000 )
);

This introduces constraints.

micah=# insert into salaries values (1, 10000);
ERROR: new row for relation "salaries" violates check constraint "salaries_salary_check"
micah=# insert into salaries values (1, 1000000);
INSERT 0 1

Access and Authentication

Before we can continue, we need to discuss the various ways of authentication for users. Done in pg_hba.conf.

There are two types of rows (besides comments and blanks): host and local rows. Rows that start with "local" relate to UNIX socket connections and those that start with "host" relate to TCP/IP connections. Local rows have four fields and host rows have five:

* local or host
* database name or all
* user/role name or all
* (host connections only) CIDR address such as 127.0.0.1/32 or 192.168.1.0/24
* authentication method

The authentication method tells how it authenticates.

* ident sameuser is the default, mapping the DB user to the Unix user
* trust accepts the login, no questions asked. Convenient for testing on a local computer, but be sure you are the only one with access to it, and never do this on a publically available network interface!
* password requires a password but sends it in plain text. Don't use.
* md5 is preferrable in nearly all cases. Encrypts the password.
* crypt encrypts it with the weaker crypt() function
* pam is available to tie into any authentication type supported by PAM
* reject rejects the connection

For non-production systems with non-critical data, this seems reasonable:

# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 md5

If system has sensitive data or untrusted local users, local connections should probably use md5 also, just in case of a system compromise. Apply with:

# service postgresql reload

Roles

"Users" are actually called roles in PostgreSQL, and they can be inherited. Some simple examples:

micah=# grant select on names to bozo;
micah=# grant select,insert on kudos to bozo;

Switching to user 'bozo':

$ psql -U bozo micah
micah=> insert into names (first, last) values ('Test', 'User');
ERROR: permission denied for relation names
micah=> insert into kudos values (2, 'Is there a rating higher than Fanatical?');
INSERT 0 1

You can change the password for a role, as any superuser:

micah=> ALTER USER bozo WITH ENCRYPTED PASSWORD 'pgrocks';
$ psql -h 127.0.0.1 -U bozo micah

With that it should let you connect as bozo only if you type the right password. It will prompt here because of the 'md5' authentication in pg_hba.conf for 'host' connections. If you connect without the -h, it will still not prompt because it would be using UNIX domain socket connections, which we have configured to 'trust'.
Upgrading PostgreSQL

Minor point releases, such as 8.3.2 to 8.3.3:

# yum update
# service postgresql restart

Major releases, such as 8.2 to 8.3 (the second digit is considered major). From the aforementioned Yum repository, replace the currently installed pgdg-redhat RPM with that of the version you want to install. Then:

# su - postgres
$ mkdir pre-upgrade
$ cp data/postgresql.conf pre-upgrade
$ cp data/pg_hba.conf pre-upgrade
$ pg_dumpall > pre-upgrade/dump.sql
$ exit
# service postgresql stop
# rm -rf /var/lib/pgsql/data
# yum update
# service postgresql initdb
# service postgresql start
$ su - postgresql
$ psql template1 < pre-upgrade/dump.sql $ cp pre-upgrade/postgresql.conf data $ cp pre-upgrade/pg_hba.conf data $ exit # service postgresql restart That should minimally do it. If the database is being updated continuously, steps should be taken to prevent writes to the DB before the dump. Edit pg_hba.conf for this. Analysis Showing queries in progress: micah=# select * from pg_stat_activity; datid | datname | procpid | usesysid | usename | current_query | waiting | xact_start | query_start | backend_start | client_addr | client_port -------+---------+---------+----------+---------+---------------------------------+---------+-------------------------------+-------------------------------+-------------------------------+-------------+------------- 16386 | micah | 29598 | 16384 | micah | select * from pg_stat_activity; | f | 2008-09-19 03:24:23.858877-05 | 2008-09-19 03:24:23.858877-05 | 2008-09-19 03:24:22.290114-05 | | -1 (1 row) micah=# \x Expanded display is on. micah=# select * from pg_stat_activity; -[ RECORD 1 ]-+-------------------------------- datid | 16386 datname | micah procpid | 29598 usesysid | 16384 usename | micah current_query | select * from pg_stat_activity; waiting | f xact_start | 2008-09-19 03:25:08.762913-05 query_start | 2008-09-19 03:25:08.762913-05 backend_start | 2008-09-19 03:24:22.290114-05 client_addr | client_port | -1 We can analyze queries and increase performance by adding queries: micah=# \x Expanded display is off. micah=# select names.first, kudos.kudo from names, kudos where names.id = kudos.id; first | kudo -----------+------------------------------------------ Lanham | Great job!!! Graham | Yeah!!! Frederick | Way to be Fanatical Graham | Is there a rating higher than Fanatical? (4 rows) micah=# explain select names.first, kudos.kudo from names, kudos where names.id = kudos.id; QUERY PLAN --------------------------------------------------------------------- Hash Join (cost=28.68..67.89 rows=1230 width=64) Hash Cond: (kudos.id = names.id) -> Seq Scan on kudos (cost=0.00..22.30 rows=1230 width=36)
-> Hash (cost=18.30..18.30 rows=830 width=36)
-> Seq Scan on names (cost=0.00..18.30 rows=830 width=36)
(5 rows)

micah=# CREATE INDEX kud_idx ON kudos (id);
CREATE INDEX

micah=# explain select names.first, kudos.kudo from names, kudos where names.id = kudos.id;
QUERY PLAN
------------------------------------------------------------------
Hash Join (cost=1.09..22.54 rows=4 width=64)
Hash Cond: (names.id = kudos.id)
-> Seq Scan on names (cost=0.00..18.30 rows=830 width=36)
-> Hash (cost=1.04..1.04 rows=4 width=36)
-> Seq Scan on kudos (cost=0.00..1.04 rows=4 width=36)
(5 rows)


source : http://yoderdev.com/photos/postgres.html

No comments: