PostgreSQL Installation


Repository Setup

If you blindly type "yum install postgresql-devel", you may have trouble getting everything installed properly.
Use the postgresql repositories instead of the Centos repsositories.
Here is how you do that.

Pre-Installation

Before you install, you should make sure you have a clean slate.
First, look for other postgres installs. I found that my centos 5 comes with postgres 8.1 already installed.
I suggest that we remove these older versions before we install the newer ones.
If you really need both versions to work side by side - this wiki page won't help you.

yum list postgres*
yum uninstall postgresql
yum uninstall postgresql_XXX
yum uninstall postgresql_YYY

If you installed from source you can go the the build directory and type

make uninstall

Installation

While we developed EAS under postgresql 8.3.7, installing 8.3.7 was giving us trouble. 8.3.11 was available via YUM and once we used YUM the install went smoothly.Using YUM is the way to go.

I followed these instructions:

Once you have YUM configured to use only the postgres respositories and not the Centos repos (see above), the following steps provide a pain free installation.

yum install postgresql
yum install postgresql-devel
yum install postgresql-server
yum install postgis
yum install postgresql-contrib

Additional Links

Initializing Database

Our Oracle trained DBAs decided to intialize the DB here

/ur01/pgsql/data

Then we want to create 2 databases.

createdb mad --owner postgres
createdb sfmaps --owner postgres

At this point, you can use pg_restore.
For more on that read
http://eas.googlecode.com/svn/trunk/releases/1_0_prod/README

Links

Performance Tuning

The EAS databases (mad, sfmaps) need to support 3 sorts of activities

  • relatively high I/O because it's geospatial
  • light OLTP actvity from single django web application
  • most connections from single django web application (no connection pooling)
  • small number of simultaneous connections (estimate max of 50)
  • heavy write activity during overnight ETL processes
  • heavy read processes from geoserver to produce web maps, especially during cache seeding

The host VMs for the DB have different amounts of RAM as shown in the table below.

Here I discuss the settings in the postgresql.conf file.
The default settings are for low resource environments and will result in relatively poor performance.
These settings are my "best guess" based on some reading (see links below).

We'll need to do some monitoring in PROD to tune these values.

parameter

value
DEV

value
QA

value
PROD

links (below)

VM GB RAM

4

6

6

 

shared_buffers

1GB

1.5GB

1.5GB

1,2

work_mem

16MB

16MB

16MB

2

maintenance_work_mem

128MB

128MB

128MB

2

wal_buffers

1MB

1MB

1MB

2

checkpoint_segments

6

6

6

2

random_page_cost

2

2

2

2

seq_page_cost

1

1

1

2

effective_cache_size

2GB

3GB

3GB

1

max_connections

100

100

100

default

A restart is needed for these configurations to take affect.

Links
1 - http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm
2 - http://workshops.opengeo.org/postgis-intro/tuning.html
3 - http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
4 - http://postgis.refractions.net/docs/ch06.html

Security - Quick Start

This is my best guess on security and users - feel free to recommend alternatives.

For the time being, I have modified

  • hba.conf
  • postgresql.conf.

so as to allow the postgres user to connect to any database from any of

  • localhost
  • geoserver
  • web server

This needs to be substantially refined for production.
The following section should address this in complete detail.

Security - Production

The dataserver should be accessible only from the following:

  • web server
  • geo server
  • localhost

We will use postgres's hba.conf to control this access.

Remember that each environment (DEV, QA, PROD) will have it's own trio of severs:

  • web server
  • geoserver
  • dataserver
    See This diagram for an example.

Get the IPs for the hba.conf from Henry or Paul.
We want to be as strict as is reasonable - not as strict as possible.

Database Accounts
I know we'll want at least 2 users

  • postgres
  • eas_user

Let's discuss if there is any value in having additional users such as:

  • geoserver user
  • etl_user

Postgres User
The postgres user will be

  • used mostly by DBAs
  • be used by developers in DEV and QA
  • used in emergency cases by developers in PROD

EAS User
The eas_user is for users of the EAS application...

  • needs to be able to read and write to all tables the mad and sfmaps databases
  • should we use a group or role here?
  • to make things simple, we'll use eas_user for access from geoserver

For any new users, we'll need to set the correct access privileges.
Until now, I have been using the postgres user everywhere (sorry!).

Installation Errors

error:

  configure: error: no acceptable C compiler found in $PATH

solution:

yum install gcc-c++

error: 

configure: error: readline library not found

solution:

yum install readline-devel.x86_64

error: 

configure: error: zlib library not found

solution:

yum install zlib.x86_64 zlib-devel.x86_64

During pg_restore you may see the following:

pg_restore --host localhost --port 5432 --username postgres --dbname mad ./mad-1_0-beta-2.backup
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 734; 1255 116672554 FUNCTION st_simplifypreservetopology(geometry, double precision) postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  could not find function "topologypreservesimplify" in file "/usr/lib64/pgsql/liblwgeom.so"
    Command was: CREATE FUNCTION st_simplifypreservetopology(geometry, double precision) RETURNS geometry
    AS '$libdir/liblwgeom', 'topolo...
pg_restore: [archiver (db)] could not execute query: ERROR:  function public.st_simplifypreservetopology(geometry, double precision) does not exist
    Command was: ALTER FUNCTION public.st_simplifypreservetopology(geometry, double precision) OWNER TO postgres;
WARNING: errors ignored on restore: 2
[postgres@CentOSDB17882 ~]$

solution
Don't worry about these errors.
This is because we upgraded postgres and postgis.
Moreover, we do not use these functions.