Mansour's Blog

Tuesday, April 25, 2023

Writing Log statements

 

The question of how to write log statement, and what level to use keeps on jumping into my head every while and then. Reading few articles from the net, mainly:

https://devcenter.heroku.com/articles/writing-best-practices-for-application-logs

While we may find it appealing to come up with what we believe "best practice", and put efforts laying a structured approach to how to write logs, I don't believe this is attainable nor productive. Alternatively, I find communicating the why/why not, and leaving it to the developer's judgement is a better approach, and yield more consistent thinking in the long run.

The first question we need to ask is "what are we trying to accomplish?". In my opinion, and as a software engineer, I like to make it easier to find root cause of errors in the software. But this is only me. Other's may have different usage for logs. Business analysts may use logs to decide what feature is used most, helping them making educated decisions about what feature to enhance and what to may be deprecate. Security analyst may find interesting information in how intruders try to gain access into the system.

So back to the question "What are we trying accomplish". We are trying to keep a record of internal and interesting events, that will "hopefully" assist the user to travel back in time and find an answer to some question related to the system operations. This includes a bugs, where we need to go back in history moments before an issue surfaced, and examine the state of the system at that moment.

The root cause may not always be readily available, but by collecting small evidences we may be able to solve a puzzle. 

 

Logging frameworks have more in common than we think. They all have log levels. Those levels can be dialed up or down, depending on the current needs, and without affecting the over all availability of the system. Log systems like Linux syslog, log4j, log4net, log4c, have more or less similar log levels to match the priority of the log statements.

  • FINE 
  • Trace
  • Debug
  • INFO
  • WARN
  • ERROR
  • FATAL 

 

Do and Don't:

  • Don't put log statements at the entry and exit of a method. This is useless. Any instrumentation agent can instrument those method to produce this log at runtime. These agents can be added and removed at application startup. There's no need to clutter the code.
  • Don't abuse INFO/ERROR levels.
  • INFO mostly for events like 
    • "Starting Deployment of accounting-app", 
    • "Config File found at /path/to/config.xml", 
    • "Connection to FTP://ftp.host.name:Port establish",
    • "Scheduled process XYZ started Successfully"
    • "Kernel: Loading driver web-cam.o",
    • "User admin logged in successfully",
    •  .. and so on
Generally speaking, I may use INFO to indicate starting or ending of an event (NOT a request), but nothing more.
  • Exceptions are not errors, when caught (unless you are re throwing them). They are part of the flow of the program.  Yes, they are exceptional flow. Think about them as a beautified "GOTO" statements. They transfer flow up to the caller, indicating unexpected condition occurred.
  • TRACE/DEBUG are your friends. Most systems, have their log level set to INFO by default. This means that those LEVELS will not clutter your log files with noise, making them less useful. However, we can activate them when needed. Evaluate the priority of the event you are logging when coding business logic. Very likely it will be one of these two.

  •  FATAL is definitely not for every day use. It is intended for events that may render the system unusable, and it signals higher priority action. Possible examples:
    • Connection to database refused
    • Remote server unreachable
    • No space left of disk

  • Avoid log statements inside loops. Those generate a lot of noise, and if your logging framework has latency, they may introduce delay and make the system less performant.

 

Keep in mind, more log lines does not mean more visibility. A noisy log file, that requires efforts searching and understanding traces is another effort wasting task. What you need, is to be able to locate and understand quickly the flow of the program execution that led to specific issue, with minimal efforts.

 

So what do we log and How do we keep our logs meaningful ?

Write log statements for interesting paths. For example:

 

def credit_account(sourceAccount, targetAccount, amount){

    if (sourceAccount.value < amount){

        logger.debug("Available funds in account {}: {} less than requested ", sourceAccountId, amount);

        // this is not error. Just validating the current state of the account.     

         throw new IllegalStateException("Insufficient Fund");

    }

logger.trace("Debiting account {} amount {}", sourceAccId, amount);

sourceAccount.value = sourceAccount.value - amount

logger.trace("Crediting account {} amount {}", targetAccId, amount);

targetAccount.value = targetAccount.value + amount

}

While this example is extreme, but it demonstrates that Business logic like this, is very good candidate to take log statements. We don't have to write log statement for every like of code.

Notice that, we didn't use ERROR level when fund is insufficient to complete the transaction. This is not an error. And not an INFO that we should always see when opening a log file. There's nothing indicates it is a WARN level. In my opinion, there's nothing wrong at all about this flow. It is just normal validation for business rules. However, capturing it, may be useful at some point in the future. And likely to be needed more than the following trace statements.








Thursday, June 4, 2015

Emacs SQL mode

Took me sometime to start working again with MySQL effectively. Again I had to find a decent clinet that allows me to edit a script and execute the commands from within the buffer for testing.

I looked into VIM plugin dbext, and it sucks!
I was not able to get it to easily get it to work, and after sometime finally it worked, but openning a file in VIM, and editing it (even non SQL), caused the screen to go dirty with garbage every where. C-L to redraw was not a help. I had to find something else !

So, emacs sql-mode comes with emacs 24.4.1 (on gentoo). So Let's open an sql script with emacs, then :

  • M-x sql-mode
  • M-x sql-mysql (fill in the values).
  • M-x sql-set-product (type mysql or postgresql .... etc).
  • M-x sql-set-sqli-buffer
Now, highlght the SQL statments, and C-c C-r or just C-c C-b to send the whole buffer. Enjoy !

StackOverflow: sql-set-sqli-buffer "there is no suitable sqli buffer"
Additional setup tips.

Tuesday, June 2, 2015

Database events - Part 2 - polling an events table with apache camel

The problem with approache discussed in part one is that it's good for one table. Assuming we have an ERPsystem with more than 1000+ tables, and we need t omonitor 300 of these tables. Would polling in this case be acceptable ?

The second approache still uses polling a single table. Let's call it "the events table". We will use triggers to log transactions to the events table. Each inserted record will have a boolean field "processed" indicating that it was read by apache camel . Our example is done with postgresql. However the idea is applicable to other relational database engines. The next SQL script is a simpliefied version of Audit Trigger. Save it in a file "postgresql.events.sql"

Now in order to use this script, we need to:

  • Load it
  • Create the events table
  • Monitor a table
  • Modify the java code from part 1 to read this table
  • The follwing script should handle this part.

    Now all we need to do is to update our camel route to read from this table and possibly to update the field "processed" to true if we don't want to use the stateful scheduler in quarz2.

    Database events



    Database integration is not a straight forward task. Most of the time it's done using a script or a process that is invoked on schedule. Usually you want to do something when a data is changed. This can be done easily from the source code of the process modifying the data. However, this is not always possbile as it requires access to the source code, and skills in the technology used. This leaves us with very limited options.

    In this series we discuss each method separately. We try to generalize and remain vendor neutral, however in some posts we will focus on postgresql specific functionality.

    For each solution we will do a quick comparison table based on the following criteria

    Criteria description
    Requires Polling
    Scalable for many tables
    Vendor neutral
    Platform independent
    Performance
    1. Database events - Part 1 - Polling a table with Apache Camel
    2. Database events - Part 2 - Polling an events table with apache camel

    Monday, May 11, 2015

    Database events - Part 1 - polling a table with apache camel

    From time to time, we face the need to obtain events from a table in a database. If we are using Hibernate/JPA or another ORM, then this easy, because we can modify the DAO to publish an event. But if we need to integrate with another system, then this is an issue. Apache camel, can help in this case. We can set it up to poll a table, based on a primary key, and notify us when an even happens. Here's a sample that monitors a table for inserted records. It does so by keeping the last id inserted. In the next run we select any ID that's higher, assuming the ID is numberical and sequencial. The main idea, is to use quartz2 camel component to store the state, and then access it in the next run. This case is simple, and it does not require write-access to the database (ie. to mark the processed records). And unlike other methods based on http://camel.apache.org/idempotent-consumer.html, where we filter duplicate records, this one works if the table is large.
    public class DataBaseSyncRouteBuilder extends RouteBuilder {
    
     @Override
     public void configure() throws Exception {
    
      from("quartz2://sync/myTimer?trigger.repeatInterval=5000&stateful=true")
        .routeId("myRoute")
        .choice()
        .when()
        .simple("${header.jobDetail.jobDataMap[last_id]} == null")
        .setBody(constant("select * from student order by id "))
        .otherwise()
        .setBody(
          simple("select * from student where id> ${header.jobDetail.jobDataMap[last_id]} order by id"))
        //
        .end()
        .to("jdbc:ds?useHeadersAsParameters=true")
        .choice()
        .when()
        .simple("${header.CamelJdbcRowCount} > 0")
        .process(new Processor() {
    
         @SuppressWarnings("unchecked")
         @Override
         public void process(Exchange exchange) throws Exception {
    
          Message msg = exchange.getIn();
          List> data = msg.getBody(List.class);
          JobDetail jobDetail = (JobDetail) msg.getHeader("jobDetail");
          JobDataMap map = jobDetail.getJobDataMap();
    
          int currentId = 0;
    
          if (map.containsKey("last_id"))
           currentId = map.getInt("last_id");
    
          for (Map row : data) {
           int i = Integer.parseInt(row.get("id").toString());
           if (currentId < i)
            currentId = i;
          }
          jobDetail.getJobDataMap().put("last_id", currentId);
         }
        })
        .log("last Processed Id: ${header.jobDetail.jobDataMap[last_id]}")
        .to("activemq:myQueue");
    
      from("activemq:myQueue").marshal().json().to("file:db.output");
    
     }
    }
    
    

    Saturday, October 11, 2014

    Adding an external JDK to gentoo

    Oracle keeps on making it hard to install java directly. On gentoo system if you do:
    neptune ~ # emerge oracle-jdk-bin
    
     * IMPORTANT: 6 news items need reading for repository 'gentoo'.
     * Use eselect news to read news items.
    
    Calculating dependencies... done!
    
    >>> Verifying ebuild manifests
    
    >>> Emerging (1 of 1) dev-java/oracle-jdk-bin-1.7.0.65
     * Fetch failed for 'dev-java/oracle-jdk-bin-1.7.0.65', Log file:
     *  '/var/tmp/portage/dev-java/oracle-jdk-bin-1.7.0.65/temp/build.log'
     *
     * Oracle requires you to download the needed files manually after
     * accepting their license through a javascript capable web browser.
     *
     * Download the following files:
     *   jdk-7u65-linux-i586.tar.gz
     * at 'http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html'
     * and move them to '/usr/portage/distfiles'
     *
    
    >>> Failed to emerge dev-java/oracle-jdk-bin-1.7.0.65, Log file:
    
    >>>  '/var/tmp/portage/dev-java/oracle-jdk-bin-1.7.0.65/temp/build.log'
    
     * Messages for package dev-java/oracle-jdk-bin-1.7.0.65:
    
     * Fetch failed for 'dev-java/oracle-jdk-bin-1.7.0.65', Log file:
     *  '/var/tmp/portage/dev-java/oracle-jdk-bin-1.7.0.65/temp/build.log'
    


    Which means you need to hit their webpage directly and download the "exact" version of the jdk, then copy it to the specified directory, to proceed with the installation. This is really pain. Alternativley you can download the JDK directly, and place it any where you want. Now to make it avaialble through "java-config", there's additional work needs to be done.
    • cd to /usr/share/java-config-2/vm
    • Create a file indicating the name and version of the jdk you want to manage through java-config. For example oracle-jdk-1.7
    • Copy the following into the file:
      VERSION="Oracle JDK 1.7.0_67" 
      JAVA_HOME="/opt/jdk1.7.0_67"
      JDK_HOME="/opt/jdk1.7.0_67"
      JAVAC="${JAVA_HOME}/bin/javac"
      PATH="${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin"
      ROOTPATH="${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin"
      LDPATH="${JAVA_HOME}/jre/lib/i386/:${JAVA_HOME}/jre/lib/i386/native_threads/:${JAVA_HOME}/jre/lib/i386/xawt/:${JAVA_HOME}/jre/lib/i386/server/"
      MANPATH="${JAVA_HOME}/man"
      PROVIDES_TYPE="JDK JRE"
      PROVIDES_VERSION="1.8"
      # Taken from sun.boot.class.path property
      BOOTCLASSPATH="${JAVA_HOME}/jre/lib/resources.jar:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/jre/lib/jsse.jar:${JAVA_HOME}/jre/lib/jce.jar:${JAVA_HOME}/jre/lib/charsets.jar"
      GENERATION="2"
      ENV_VARS="JAVA_HOME JDK_HOME JAVAC PATH ROOTPATH LDPATH MANPATH"
      VMHANDLE="oracle-jdk-1.7"
      BUILD_ONLY="FALSE"
      

      you will need to modify the path to point to your JDK.
    • cd /usr/lib/jvm/
    • Create a symbolic link to the home of you JDK. The link name should be exactely the same as in the field "VMHANDLE" in the file. For example:
      neptune jvm # ln -s /opt/jdk1.7.0_67/ oracle-jdk-1.7
    • Test it with java-config -L. You should see your new JDK installed and you can select it with "java-config -S 1", then java -version.

    Friday, June 22, 2012

    eclipse seizure - 1

    In some cases when you are trying to install a  eclipse plugin through their update manager, and you are on Linux, you may get something like:


    Cannot complete the install because one or more required items could not be found. Software currently installed: Shared profile ....... etc.

    It happened to me many times. What do to ?

    Login as root, and install the updates you need. Big chance it may work. You may try to change the permissions for the base eclipse installation, but I didn't try this yet.