PostGIS: remove all points (but one) that are closest than “x” meters from the others

To solve this problem a number of possible solutions may come to mind, usually this are not very straightforward and/or very slow.

It turns out that the right (and spectacularly fast) solution is:

SELECT ST_GeometryN(unnest(ST_ClusterWithin(f.geom, 0.4)),1) AS geom FROM points AS f

Quick explanation:

  1. ST_ClusterWithin creates a cluster/GeometryCollection where each GeometryCollection represents a set of geometries separated by no more than the specified distance. All the GeometryCollection are placed into an array
  2. unnest splits the array and puts each element into a separate row
  3. ST_GeometryN extracts the Nth geometry from a GeometryCollection

In the example above the distance used for this analysis is 0.4 meters, and with ST_GeometryN we extract always the 1st geometry from the cluster/GeometryCollection of points, because for this problem a) is not important what point we keep (in the cluster) and b) the first geometry of a GeometryCollection always exist.

If is needed to know how many points are in each cluster/GeometryCollection the ST_NumGeometries can be used:

SELECT unnest(ST_ClusterWithin(f.geom, 0.4)) AS geom, ST_NumGeometries(unnest(ST_ClusterWithin(f.geom, 0.4))) AS ng
FROM dados.points AS f

A different approach, not as fast as the above, could be:

CREATE TABLE points_clusters AS
WITH temp_table AS (SELECT (ST_Dump(ST_Union(ST_Buffer(geom, 0.2, 'quad_segs=100')))).geom::geometry(POLYGON,3047) AS geom FROM points)
SELECT row_number() over() AS gid, geom AS geom FROM temp_table

then

CREATE TABLE points_without_closest AS
SELECT DISTINCT ON (geom) id_point, geom FROM (
SELECT a.gid AS id_point,b.gid AS id_buffer,a.geom AS geom, row_number() OVER (PARTITION BY b.gid ORDER BY a.gid) AS rownum FROM points a,
points_clusters b WHERE ST_Within(a.geom,b.geom))
tmp
WHERE rownum < 2;
Advertisements
PostGIS: remove all points (but one) that are closest than “x” meters from the others

Insert binaries into a PostgreSQL table as “bytea” data type

In older versions of PostgreSQL a native function to insert binaries as “bytea” data type is missing, so:

create or replace function bytea_import(p_path text, p_result out bytea)
 language plpgsql as $$
declare
 l_oid oid;
 r record;
begin
 p_result := '';
 select lo_import(p_path) into l_oid;
 for r in ( select data
 from pg_largeobject
 where loid = l_oid
 order by pageno ) loop
 p_result = p_result || r.data;
 end loop;
 perform lo_unlink(l_oid);
end;$$;

then (example of a case where a picture file name is already available in another column):

 UPDATE table SET bytea_column_name = bytea_import('/path/to/photo/' || "column_with_picture_name");
Insert binaries into a PostgreSQL table as “bytea” data type

Block/allow (specific port) access by country using UFW (Uncomplicated Firewall)

Very clear UFW tutorial:

https://www.digitalocean.com/community/tutorials/how-to-set-up-a-firewall-with-ufw-on-ubuntu-14-04

Download a text file with all the IP ranges for a given specific country (select CIDR as format):

https://www.ip2location.com/free/visitor-blocker

Save the (text) file and edit it to remove the first lines (the ones that are comments). After that issue a command like the following (the “insert 1” is to insert the new rules at the top of ufw rules list):

while read line; do sudo ufw insert 1 deny from $line to any port 22; done < bad_country.txt

 

Block/allow (specific port) access by country using UFW (Uncomplicated Firewall)

Some useful RSYNC parameter

1) Sinchronize only files younger/older than “x” days (31 ij the following example):

find /path/to/folder -type f -mtime -31 > /tmp/rsyncfiles

or

find /path/to/folder -type f -mtime +31 > /tmp/rsyncfiles

then

rsync -Ravh --files-from=/tmp/rsyncfiles . /target/folder

2) Ignore existing files

rsync --ignore-existing -raz /source/folder/ /target/folder/

3) Update the remote only if a newer version is on the local filesystem

rsync --update -raz /source/folder/ /target/folder/
Some useful RSYNC parameter

Some example of useful PostgreSQL/PostGIS queries

List database triggers:

SELECT * FROM information_schema.triggers;

List tables owned by a specific user:

SELECT *
FROM pg_tables t
WHERE t.tableowner = 'username';

Change permissions on all tables in the same schema:

REVOKE ALL ON ALL TABLES IN SCHEMA schemaname FROM username;

Loop schema names and run a whatever query:

CREATE OR REPLACE FUNCTION change_all()
 RETURNS VOID AS
 $$
 DECLARE rec RECORD;
 BEGIN
-- Get all the schemas
 FOR rec IN
 SELECT DISTINCT schemaname
 FROM pg_catalog.pg_tables
 -- You can exclude the schema which you don't want to drop by adding another condition here
 WHERE schemaname not like 'pg_catalog'
 LOOP
 EXECUTE 'GRANT ALL ON SCHEMA ' || rec.schemaname || ' TO username';
 END LOOP;
 RETURN;
 END;
 $$ LANGUAGE plpgsql;

SELECT change_all();

Add a sequence to an (editable) view:

ALTER VIEW schema.tablename ALTER gid SET DEFAULT nextval('schema.tablename_gid_seq');
Function/trigger to get/save the (database) username when doing an INSERT or UPDATE

CREATE OR REPLACE FUNCTION adduser()
 RETURNS trigger AS
 $BODY$
 BEGIN
 NEW.column_to_be_filled_with_username = session_user;
 RETURN NEW;
 END;
 $BODY$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS data_adduser ON schema.tablename;
 CREATE TRIGGER data_adduser BEFORE UPDATE ON schema.tablename
 FOR EACH ROW
 WHEN (OLD.edited_column IS DISTINCT FROM NEW.edited_column)
 EXECUTE PROCEDURE adduser();

A very simple but effective way of keeping history of changes in a table (backup table structure identical to the one being backed up, plus some other columns to save some additional info):

CREATE OR REPLACE FUNCTION backup_row()
 RETURNS trigger AS
$BODY$
 BEGIN
INSERT INTO backup_table (col1, col2, editor, col3, username, edit_data, geom) 
VALUES (OLD.col1, OLD.col2, OLD.col3, session_user, current_timestamp, OLD.geom);
 RETURN NEW;
 END;
$BODY$
 LANGUAGE plpgsql VOLATILE
 COST 100;
ALTER FUNCTION backup_row()
 OWNER TO username;


CREATE TRIGGER backup_row
 BEFORE UPDATE OR DELETE
 ON schema.tablenmae
 FOR EACH ROW
 EXECUTE PROCEDURE backup_row();

Function to add some attribute/geometry validation (see more here: http://twiav-tt.blogspot.pt/2012/07/postgis-trigger-function-retrieve.html)

CREATE OR REPLACE FUNCTION update_observation()
 RETURNS trigger AS
$BODY$
 DECLARE
 vf boolean;
 BEGIN
 -- Check the description of the observation
 IF NEW.description IS NULL THEN
 RAISE EXCEPTION 'Description cannot be empty';
 END IF;

 -- Check the geometry is completely contained within the boundaries of a polygon layer
vf := (SELECT ST_Within(NEW.geom, geom) FROM boundaries WHERE ST_Within(NEW.geom, geom));
 IF vf IS NOT TRUE THEN
 RAISE EXCEPTION 'The new geometry must be completely within the boundaries of the limits layer';
 END IF;

RETURN NEW;
 END;
$BODY$
 LANGUAGE plpgsql VOLATILE
 COST 100;
ALTER FUNCTION update_observation()
 OWNER TO username;

A more advanced example on validation and keeping history of changes: http://www.opengis.ch/2018/01/08/postgresql-back-end-solution-for-quality-assurance-and-data-archive/

Some example of useful PostgreSQL/PostGIS queries