# PostGIS: remove all points (but one) that are closest than “x” meters from the others

To solve this problem a number of possible solutions may come to mind, usually this are not very straightforward and/or very slow.

It turns out that the right (and spectacularly fast) solution is:

`SELECT ST_GeometryN(unnest(ST_ClusterWithin(f.geom, 0.4)),1) AS geom FROM points AS f`

Quick explanation:

1. ST_ClusterWithin creates a cluster/GeometryCollection where each GeometryCollection represents a set of geometries separated by no more than the specified distance. All the GeometryCollection are placed into an array
2. unnest splits the array and puts each element into a separate row
3. ST_GeometryN extracts the Nth geometry from a GeometryCollection

In the example above the distance used for this analysis is 0.4 meters, and with ST_GeometryN we extract always the 1st geometry from the cluster/GeometryCollection of points, because for this problem a) is not important what point we keep (in the cluster) and b) the first geometry of a GeometryCollection always exist.

If is needed to know how many points are in each cluster/GeometryCollection the ST_NumGeometries can be used:

```SELECT unnest(ST_ClusterWithin(f.geom, 0.4)) AS geom, ST_NumGeometries(unnest(ST_ClusterWithin(f.geom, 0.4))) AS ng

A different approach, not as fast as the above, could be:

```CREATE TABLE points_clusters AS
WITH temp_table AS (SELECT (ST_Dump(ST_Union(ST_Buffer(geom, 0.2, 'quad_segs=100')))).geom::geometry(POLYGON,3047) AS geom FROM points)
SELECT row_number() over() AS gid, geom AS geom FROM temp_table```

then

```CREATE TABLE points_without_closest AS
SELECT DISTINCT ON (geom) id_point, geom FROM (
SELECT a.gid AS id_point,b.gid AS id_buffer,a.geom AS geom, row_number() OVER (PARTITION BY b.gid ORDER BY a.gid) AS rownum FROM points a,
points_clusters b WHERE ST_Within(a.geom,b.geom))
tmp
WHERE rownum < 2;```

# Insert binaries into a PostgreSQL table as “bytea” data type

In older versions of PostgreSQL a native function to insert binaries as “bytea” data type is missing, so:

```create or replace function bytea_import(p_path text, p_result out bytea)
language plpgsql as \$\$
declare
l_oid oid;
r record;
begin
p_result := '';
select lo_import(p_path) into l_oid;
for r in ( select data
from pg_largeobject
where loid = l_oid
order by pageno ) loop
p_result = p_result || r.data;
end loop;
end;\$\$;```

then (example of a case where a picture file name is already available in another column):

` UPDATE table SET bytea_column_name = bytea_import('/path/to/photo/' || "column_with_picture_name");`

# Block/allow (specific port) access by country using UFW (Uncomplicated Firewall)

Very clear UFW tutorial:

https://www.digitalocean.com/community/tutorials/how-to-set-up-a-firewall-with-ufw-on-ubuntu-14-04

Download a text file with all the IP ranges for a given specific country (select CIDR as format):

https://www.ip2location.com/free/visitor-blocker

Save the (text) file and edit it to remove the first lines (the ones that are comments). After that issue a command like the following (the “insert 1” is to insert the new rules at the top of ufw rules list):

`while read line; do sudo ufw insert 1 deny from \$line to any port 22; done < bad_country.txt`

# Some useful RSYNC parameter

1) Sinchronize only files younger/older than “x” days (31 ij the following example):

`find /path/to/folder -type f -mtime -31 > /tmp/rsyncfiles`

or

`find /path/to/folder -type f -mtime +31 > /tmp/rsyncfiles`

then

`rsync -Ravh --files-from=/tmp/rsyncfiles . /target/folder`

2) Ignore existing files

`rsync --ignore-existing -raz /source/folder/ /target/folder/`

3) Update the remote only if a newer version is on the local filesystem

`rsync --update -raz /source/folder/ /target/folder/`

# Simple versioning with rsync

`rsync -rv --backup --suffix=`date +'.%F_%H-%M'` --include='*.csv' /source/folder/ /target/folder/`

# Some example of useful PostgreSQL/PostGIS queries

List database triggers:

`SELECT * FROM information_schema.triggers;`

List tables owned by a specific user:

```SELECT *
FROM pg_tables t

Change permissions on all tables in the same schema:

`REVOKE ALL ON ALL TABLES IN SCHEMA schemaname FROM username;`

Loop schema names and run a whatever query:

```CREATE OR REPLACE FUNCTION change_all()
RETURNS VOID AS
\$\$
DECLARE rec RECORD;
BEGIN
-- Get all the schemas
FOR rec IN
SELECT DISTINCT schemaname
FROM pg_catalog.pg_tables
-- You can exclude the schema which you don't want to drop by adding another condition here
WHERE schemaname not like 'pg_catalog'
LOOP
EXECUTE 'GRANT ALL ON SCHEMA ' || rec.schemaname || ' TO username';
END LOOP;
RETURN;
END;
\$\$ LANGUAGE plpgsql;

SELECT change_all();```

Add a sequence to an (editable) view:

`ALTER VIEW schema.tablename ALTER gid SET DEFAULT nextval('schema.tablename_gid_seq');`
```Function/trigger to get/save the (database) username when doing an INSERT or UPDATE

RETURNS trigger AS
\$BODY\$
BEGIN
RETURN NEW;
END;
\$BODY\$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS data_adduser ON schema.tablename;
CREATE TRIGGER data_adduser BEFORE UPDATE ON schema.tablename
FOR EACH ROW
WHEN (OLD.edited_column IS DISTINCT FROM NEW.edited_column)

A very simple but effective way of keeping history of changes in a table (backup table structure identical to the one being backed up, plus some other columns to save some additional info):

```CREATE OR REPLACE FUNCTION backup_row()
RETURNS trigger AS
\$BODY\$
BEGIN
INSERT INTO backup_table (col1, col2, editor, col3, username, edit_data, geom)
VALUES (OLD.col1, OLD.col2, OLD.col3, session_user, current_timestamp, OLD.geom);
RETURN NEW;
END;
\$BODY\$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION backup_row()

CREATE TRIGGER backup_row
BEFORE UPDATE OR DELETE
ON schema.tablenmae
FOR EACH ROW
EXECUTE PROCEDURE backup_row();```

Function to add some attribute/geometry validation (see more here: http://twiav-tt.blogspot.pt/2012/07/postgis-trigger-function-retrieve.html)

```CREATE OR REPLACE FUNCTION update_observation()
RETURNS trigger AS
\$BODY\$
DECLARE
vf boolean;
BEGIN
-- Check the description of the observation
IF NEW.description IS NULL THEN
RAISE EXCEPTION 'Description cannot be empty';
END IF;

-- Check the geometry is completely contained within the boundaries of a polygon layer
vf := (SELECT ST_Within(NEW.geom, geom) FROM boundaries WHERE ST_Within(NEW.geom, geom));
IF vf IS NOT TRUE THEN
RAISE EXCEPTION 'The new geometry must be completely within the boundaries of the limits layer';
END IF;

RETURN NEW;
END;
\$BODY\$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION update_observation()