Groovy blob and mysql

In certain occasions e.g. when you want to save scrapped html page or just a specific part of it (e.g. div with its contents) the recommendation I found is to use the blob type and save xml document as binary stream.

In groovy, this is achieved in a rather simple way:

class Document {
  int id
  String title
  String xmlContent
}

Sql sql = Sql.newInstance(
          "jdbc:mysql://localhost:3306/documents_db?useUnicode=true&characterEncoding=UTF-8&useBlobToStoreUTF8OutsideBMP=true",
          "user", "pass", "com.mysql.jdbc.Driver")

// We assume db table similar to:
sql.execute("""
    CREATE TABLE IF NOT EXISTS `document` (
      `id` bigint(32) NOT NULL AUTO_INCREMENT,
      `title` varchar(500) NOT NULL,
      `xmlContent` blob,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    """)

def saveDocuments(def docs) {
  println "Saving documents to database..."
  def documents = sql.dataSet("document")
  docs.each {Document doc ->
    documents.add(title: doc.title, xmlContent: doc.xmlContent)
  }
}

When you want to retrieve data, with Groovy this is not much complicated either:

def loadDocuments() {
  def list = []
  sql.eachRow("SELECT * FROM document") {
    Document doc = new Document(it.toRowResult())
    if (it['xmlContent'] != null) {
      doc.xmlContent = new String(it['xmlContent'])
    }
    list.add(doc)
  }
  println "Loaded ${list.size()} documents from database"
  list
}

I tried the code with mysql database but I assume that you can achieve the same with any other database.

One hint for mysql blobs and utf-8: although utf-8 charset behaves very well with other database table fields (if you add useUnicode=true and characterEncoding=UTF-8 parameters to your connection string), in order to work with utf-8  blob fields you should adjust one more parameter: useBlobToStoreUTF8OutsideBMP=true. With this parameter set, your utf8 encoded xml documents will behave well with mysql database.


Posted

in

,

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *