FileStorm

Overview

FileStorm is a simple file storage API for situations where the Java Content Repository is overkill, BLOBs in a database are not appropriate, and FTP does not quite do the trick. FileStorm allows applications to perform CRUD operations on content through a single interface, locally or remotely.

Features and Limitations

  • Multiple implementations: file system, cached, HTTP.
  • Embeddable.
  • Easy to extend and customize.
  • Does not support transactions.

Design

FileStorm is designed as follows:

  • The Store interface specifies CRUD-like operations that are performed over content.
  • The StoreFactory class allows creating instances of the different Store implementations in the framework.

Implementations

The framework provides different implementations of the Store interface. They are explained one by one below.

FileStore

The FileStore class implements the Store interface over the file system. A base directory must be provided, under which file operations will be performed. To create an instance of FileStore, use the constructor or the StoreFactory.

The FileStore supports hierarchical naming: the names under which content is bound can be hierarchical, corresponding to file paths under the base directory. Here is an example of how to use the FileStore class.

          Store store = StoreFactory.newFileStore(new File("working"), 1024);
          // input corresponds to a java.io.InputStream
          store.put("documents/file.doc", input);
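
To illustrate how a hierarchical name can correspond to a file path under the base directory, here is a minimal sketch that uses only the JDK. The NameResolver class and its resolve method are hypothetical and are not part of FileStorm; the check that rejects names escaping the base directory is an assumption about what a careful implementation might do, not a description of what FileStore actually does.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of the FileStorm API.
class NameResolver {

    // Maps a hierarchical content name to a path under the base directory.
    // Names that would resolve outside the base directory are rejected.
    static Path resolve(Path baseDir, String name) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path target = base.resolve(name).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Name escapes base directory: " + name);
        }
        return target;
    }

    public static void main(String[] args) {
        Path base = Paths.get("working");
        // Prints the absolute path of working/documents/file.doc
        System.out.println(resolve(base, "documents/file.doc"));
    }
}
```

With this sketch, a name such as ../outside.txt raises an IllegalArgumentException instead of reaching a file outside the base directory.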
        

CachingFileStore

The CachingFileStore extends the FileStore, on top of which it implements caching logic. That is: the CachingFileStore interacts with a delegate Store with which it synchronizes its state (the delegate acting as a master copy).

The caching store's content is kept in the file system (thereby using the functionality provided by the parent FileStore class). A caching timeout is provided at instantiation time: content is automatically refreshed when the timeout expires.

Here is an example:

          Store master   = StoreFactory.newFileStore(new File("working/master"), 1024);
          // the cache keeps its own copy in a directory separate from the master's
          Store cache    = StoreFactory.newCachingFileStore(
                             new File("working/cache"),
                             master,
                             1024,
                             30000);
          // input corresponds to a java.io.InputStream
          master.put("documents/file.doc", input);
          input = cache.get("documents/file.doc");
          // process input...         
        

The above usage is not "standard": usually, a caching store is used for read-only operations, and the master is used directly by admin applications to perform the write operations.
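The timeout-driven synchronization with the master can be pictured with the following self-contained sketch. It is not FileStorm's implementation: the TimedCache class is hypothetical, content is modelled as plain strings rather than streams, and the refresh is done lazily on read, whereas the real CachingFileStore may well refresh in another way (for example in a background thread).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration of a timeout-based cache in front of a master copy.
class TimedCache {

    private static final class Entry {
        final String value;
        final long loadedAt;

        Entry(String value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Function<String, String> master; // reads from the master copy
    private final long timeoutMillis;              // how long a cached entry stays fresh
    private final LongSupplier clock;              // injectable clock, in milliseconds
    private final Map<String, Entry> entries = new HashMap<>();

    TimedCache(Function<String, String> master, long timeoutMillis, LongSupplier clock) {
        this.master = master;
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
    }

    // Returns the cached value, reloading it from the master once the timeout has elapsed.
    String get(String name) {
        long now = clock.getAsLong();
        Entry entry = entries.get(name);
        if (entry == null || now - entry.loadedAt >= timeoutMillis) {
            entry = new Entry(master.apply(name), now);
            entries.put(name, entry);
        }
        return entry.value;
    }

    public static void main(String[] args) {
        Map<String, String> master = new HashMap<>();
        master.put("documents/file.doc", "v1");
        long[] now = {0};
        TimedCache cache = new TimedCache(master::get, 30000, () -> now[0]);

        System.out.println(cache.get("documents/file.doc")); // prints "v1"
        master.put("documents/file.doc", "v2");
        now[0] = 10000;
        System.out.println(cache.get("documents/file.doc")); // still "v1": timeout not reached
        now[0] = 30000;
        System.out.println(cache.get("documents/file.doc")); // prints "v2": entry was refreshed
    }
}
```

The injectable clock is only there to make the behaviour easy to demonstrate without waiting for real time to pass.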

HTTP File Store

The StoreServlet and StoreClient classes interact to provide a Store that is accessible over the network.

Note that the servlet can also be conveniently accessed through web browsers.

The architecture is as follows:

  • The StoreServlet wraps a FileStore, handling get/put/delete HTTP requests and translating them to operations on the wrapped instance.
  • The StoreClient implements the Store interface over an HTTP client that connects with the servlet.

The Servlet

The servlet is configured as follows (see the javadoc):

<?xml version="1.0"?>

<web-app>
  <display-name>Web App Example</display-name>

  <servlet>
    <servlet-name>testServlet</servlet-name>
    <display-name>FileStorm Test Servlet</display-name>
    <servlet-class>org.sapia.filestorm.http.StoreServlet</servlet-class>
    
    <init-param>
      <param-name>store.basedir</param-name>
      <param-value>${user.dir}/etc/servlet</param-value>      
    </init-param>
    <!-- this one is optional - defaults to 1024 -->
    <init-param>
      <param-name>store.bufsize</param-name>
      <param-value>500</param-value>      
    </init-param>
    <init-param>
      <param-name>store.response.caching.seconds</param-name>
      <param-value>30</param-value>      
    </init-param>    
    
    <init-param>
      <param-name>store.put.enabled</param-name>
      <param-value>true</param-value>      
    </init-param>      
    
    <init-param>
      <param-name>store.delete.enabled</param-name>
      <param-value>true</param-value>      
    </init-param>      
  </servlet>
	
  <servlet-mapping>
    <servlet-name>testServlet</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>

</web-app>

The initialization parameters are passed to the constructor of the FileStore class. As was mentioned, the servlet maps get/put/delete requests to the corresponding methods on the store instance. The path information (HttpServletRequest.getPathInfo()) is interpreted as the name of the resource for which the operation should be performed. For example, given a store servlet published at http://localhost:8080/store (where store is the context path), and the following URL: http://localhost:8080/store/documents/file.doc, the name of the resource will be interpreted as being documents/file.doc.
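The mapping from path information to resource name can be sketched as follows. This is an assumption about how such a translation might look, not the servlet's actual code: the ResourceNames class is hypothetical, and it relies only on the Servlet API's documented behaviour that getPathInfo() returns either null or a string beginning with "/".

```java
// Hypothetical helper for illustration only; not part of the FileStorm API.
class ResourceNames {

    // Turns a servlet path info string such as "/documents/file.doc"
    // into a store name such as "documents/file.doc".
    static String fromPathInfo(String pathInfo) {
        if (pathInfo == null) {
            return ""; // no extra path information after the servlet path
        }
        int start = 0;
        while (start < pathInfo.length() && pathInfo.charAt(start) == '/') {
            start++; // skip leading slashes
        }
        return pathInfo.substring(start);
    }

    public static void main(String[] args) {
        // Path info for http://localhost:8080/store/documents/file.doc
        System.out.println(fromPathInfo("/documents/file.doc")); // prints "documents/file.doc"
    }
}
```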

The HTTP Client

On the client side, you use a StoreClient to interact with the servlet, which allows using the usual Store interface:

Store store = StoreFactory.newStoreClient("http://localhost:8080/store");
// input corresponds to a java.io.InputStream
store.put("documents/file.doc", input);

Browser Access

The servlet can be accessed through a web browser by typing the URL corresponding to the desired file. For example, given the servlet configured above, we could type http://localhost:8080/store/documents/file.doc in the browser's address bar.

The servlet uses Sun's activation framework to determine the MIME content type of stored files. In addition, repeated GET requests to the servlet can be avoided by setting a caching parameter (see the servlet's javadoc and the example configuration above). The value of this parameter is used to set the Cache-Control response header, which allows client browsers to cache the downloaded files.
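As a rough illustration of how a caching value in seconds can translate into a Cache-Control header, consider the sketch below. The CacheHeaders class is hypothetical, and the choice of the max-age directive (and of no-cache for non-positive values) is an assumption: the exact directive emitted by the StoreServlet is not stated here, so consult its javadoc for the real behaviour.

```java
// Hypothetical helper for illustration only; not part of the FileStorm API.
class CacheHeaders {

    // Builds a Cache-Control header value from a caching period in seconds.
    // Assumption: a positive period maps to max-age, anything else to no-cache.
    static String cacheControl(int seconds) {
        return seconds > 0 ? "max-age=" + seconds : "no-cache";
    }

    public static void main(String[] args) {
        // Matches the value 30 used for store.response.caching.seconds above
        System.out.println("Cache-Control: " + cacheControl(30)); // prints "Cache-Control: max-age=30"
    }
}
```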

As a last note, since allowing access to the servlet directly from web browsers can pose security problems in the case of PUT and DELETE requests, these are disabled by default. To enable them, set the appropriate servlet initialization parameters (as illustrated in the web.xml given above).

Conclusion

FileStorm is a simple API that can easily be extended and used in many different ways:

  • Wrap a StoreClient in a CachingFileStore to improve performance on read operations.
  • Use rsync for replication between distributed FileStores organized in a master-slave topology: the master is used for admin/read/write operations, and the slaves are used for load-balancing of read operations.
  • etc.