Showing posts with label database. Show all posts
Showing posts with label database. Show all posts

Tuesday, June 2, 2015

Database events - Part 2 - polling an events table with apache camel

The problem with approache discussed in part one is that it's good for one table. Assuming we have an ERPsystem with more than 1000+ tables, and we need t omonitor 300 of these tables. Would polling in this case be acceptable ?

The second approache still uses polling a single table. Let's call it "the events table". We will use triggers to log transactions to the events table. Each inserted record will have a boolean field "processed" indicating that it was read by apache camel . Our example is done with postgresql. However the idea is applicable to other relational database engines. The next SQL script is a simpliefied version of Audit Trigger. Save it in a file "postgresql.events.sql"

Now in order to use this script, we need to:

  • Load it
  • Create the events table
  • Monitor a table
  • Modify the java code from part 1 to read this table
  • The follwing script should handle this part.

    Now all we need to do is to update our camel route to read from this table and possibly to update the field "processed" to true if we don't want to use the stateful scheduler in quarz2.

    Database events



    Database integration is not a straight forward task. Most of the time it's done using a script or a process that is invoked on schedule. Usually you want to do something when a data is changed. This can be done easily from the source code of the process modifying the data. However, this is not always possbile as it requires access to the source code, and skills in the technology used. This leaves us with very limited options.

    In this series we discuss each method separately. We try to generalize and remain vendor neutral, however in some posts we will focus on postgresql specific functionality.

    For each solution we will do a quick comparison table based on the following criteria

    Criteria description
    Requires Polling
    Scalable for many tables
    Vendor neutral
    Platform independent
    Performance
    1. Database events - Part 1 - Polling a table with Apache Camel
    2. Database events - Part 2 - Polling an events table with apache camel

    Monday, May 11, 2015

    Database events - Part 1 - polling a table with apache camel

    From time to time, we face the need to obtain events from a table in a database. If we are using Hibernate/JPA or another ORM, then this easy, because we can modify the DAO to publish an event. But if we need to integrate with another system, then this is an issue. Apache camel, can help in this case. We can set it up to poll a table, based on a primary key, and notify us when an even happens. Here's a sample that monitors a table for inserted records. It does so by keeping the last id inserted. In the next run we select any ID that's higher, assuming the ID is numberical and sequencial. The main idea, is to use quartz2 camel component to store the state, and then access it in the next run. This case is simple, and it does not require write-access to the database (ie. to mark the processed records). And unlike other methods based on http://camel.apache.org/idempotent-consumer.html, where we filter duplicate records, this one works if the table is large.
    public class DataBaseSyncRouteBuilder extends RouteBuilder {
    
     @Override
     public void configure() throws Exception {
    
      from("quartz2://sync/myTimer?trigger.repeatInterval=5000&stateful=true")
        .routeId("myRoute")
        .choice()
        .when()
        .simple("${header.jobDetail.jobDataMap[last_id]} == null")
        .setBody(constant("select * from student order by id "))
        .otherwise()
        .setBody(
          simple("select * from student where id> ${header.jobDetail.jobDataMap[last_id]} order by id"))
        //
        .end()
        .to("jdbc:ds?useHeadersAsParameters=true")
        .choice()
        .when()
        .simple("${header.CamelJdbcRowCount} > 0")
        .process(new Processor() {
    
         @SuppressWarnings("unchecked")
         @Override
         public void process(Exchange exchange) throws Exception {
    
          Message msg = exchange.getIn();
          List> data = msg.getBody(List.class);
          JobDetail jobDetail = (JobDetail) msg.getHeader("jobDetail");
          JobDataMap map = jobDetail.getJobDataMap();
    
          int currentId = 0;
    
          if (map.containsKey("last_id"))
           currentId = map.getInt("last_id");
    
          for (Map row : data) {
           int i = Integer.parseInt(row.get("id").toString());
           if (currentId < i)
            currentId = i;
          }
          jobDetail.getJobDataMap().put("last_id", currentId);
         }
        })
        .log("last Processed Id: ${header.jobDetail.jobDataMap[last_id]}")
        .to("activemq:myQueue");
    
      from("activemq:myQueue").marshal().json().to("file:db.output");
    
     }
    }
    
    

    Monday, June 11, 2012

    Mounting database as a file

    Postgresql offers the ability to query external data. In some scenarios this can be really useful. This called SQL/MED ("SQL Management of External Data").

    One of those cool scenarios is when you need to query a file inside code as if it was a database table. To do so you need to have at least postgresql-9.1.

    We will use the simplest file format to demonstrate the idea.

    Create a CVS file:

    Year,Make,Model,Length
    1997,Ford,E350,2.34
    2000,Mercury,Cougar,2.38
     
     
    Create a postgresql server:
     
    CREATE SERVER file_server FOREIGN DATA WRAPPER "file_fdw" ;
    
    
    
    
    
    
    
    
    Now it's the time to create the table:
    
    
    
    
    CREATE FOREIGN TABLE test (
    year int,
    make text,
    model text,
    length float
    ) SERVER file_server
    OPTIONS (format 'csv', filename '/tmp/test.txt', header 'true', delimiter ',', null '');
     
    CREATE FOREIGN TABLE
    
    
    Let's try a select:
    
    
    testdb=# SELECT * from test ;                                                      
     year |  make   | model  | length 
    ------+---------+--------+--------
     1997 | Ford    | E350   |   2.34
     2000 | Mercury | Cougar |   2.38
    (2 rows)
     
    
    
    
    
    
    
    Change a value in the file, and re-run the select. You should see the changes reflected in query results.
    
    
    Additional Forgein Data Wrapper exists and they are cool. Using them, you can query 'twitter', MySql tables,
    LDAP servers.
     
    Additional documentation:
    Foreign data wrappers
    
    
    
    
     
    For those who likes this idea but don't use postgresql, here's a perl utility that may help:
    DBIx::FileSystem
    
    
    Related documents:
    http://archive09.linux.com/feature/127055 
    https://github.com/bianster/mysqlfs
    http://www.nongnu.org/libsqlfs/
    http://www.perlmonks.org/?node_id=397814