Simran Jindal's Blog

my effort to understand what the "it depends" really depends upon

Denali CTP 3 – How to Set Up the FileTable and access the unstructured data from SQL Server

with 3 comments

One of the latest features of SQL Server Denali or SQL Server 2011 is the FileTable feature. Quite a self explanatory name.  It’s a database table that stores file system data OR a table of files, an advancement that builds on FILESTREAM technology. The feature is great for applications that require file and directory storage in the database with Windows API compatibility and non transactional access.

So what  is a  FILETABLE really ?

  • It is a specialized user table with pre-defined schema that stores FILESTREAM data, as well as file and directory hierarchy information and file attributes.
  • The hierarchy starts from the root directory (windows share) that is specified at the time of creating the FileTable.  So when you add or remove files or directories from this windows share, the entire directory structure with the FILESTREAM data gets added to the FileTable.
  • Each row in FileTable represents a file or a directory or each file or directory in the windows share represents a row in the FileTable.
  • The directory hierarchy surfaced through the share is a purely logical directory structure that is maintained within the FileTable which means that for a Windows API based application or a user managing the shared folder, it is a normal windows folder except that  the files are getting read out of the database.
  • Besides FILESTREAM data and path to the file or  folder , ten additional file attributes like creation_time, last_write_time, last_access_time, is_directory, is_offline, is_hidden,  is_readonly, is_archive, is_system, is_temporary are stored for each file or directory as applicable. See FileTable Schema for details.
  • A file_type column, which is a persisted computed column, is populated to store the file type to support full-text search and semantic search.
  • Calls to create or change a file or a directory through windows share is intercepted by a SQL Server  component and reflected in the corresponding FileTable.

Having said all of that there are certain steps to be followed before we can start enjoying this cool feature.

Pre-requisites to creating FileTables

As stated earlier, FileTables are based on the FILESTREAM technology, so FILESTREAM needs to be enabled at the instance level as the first step.

To do that go to SQL Server Configuration Manager (Start -> All Programs -> Microsoft SQL Server Code-Named “Denali” ->  Configuration Tools), in the list of services,  right-click SQL Server Services and click Open.  In the list of SQL Server instance(s) (in case you have muliple instances installed) select the instance on which you want to enable FILESTREAM. Select the FILESTREAM tab, select the Enable FILESTREAM for Transact SQL access, if you want to read and write FILESTREAM data from Windows, select Enable FILESTREAM for file I/O streaming access.  Enter the name of the winodws share in the Winodws Share Name box. Click OK.

Go to SQL Server  Management  Studio, connect to SQL Server Denali instance and open a new query window.  In the query editor enter and execute the following code.
EXEC sp_configure filestream_access_level, 2
RECONFIGURE
Go

The value ’2′ here enables the FILESTREAM for both Transact-SQL and Win32 streaming access.
The next step is to enable non-transactional access at the database level. Why do we need to do this?  As stated earlier, that the file structure is logical and the data is actually being read of the database. To be able to access the windows share directly or through Windows API based applications, FileTables let Windows applications obtain a Windows file handle to FILESTREAM data without requiring a transaction or in other words, we are accessing table data outside of the RDBMS. To allow this non-transactional access to unstructured data, we need to specify the level of non-transactional access at the database level for each database that contains or will contain FileTables. Before enabling the non-transactional access let’s create a test database with a FILESTREAM filegroup on the Denali CTP 3 instance.
--go to master
USE MASTER
GO

--check if Database already exists
IF DB_ID ('FileTableTestDB') IS NOT NULL
BEGIN
	DROP DATABASE FileTableTestDB
END
GO
--Create the test database
CREATE DATABASE FileTableTestDB
ON
PRIMARY
(
    NAME = FileTableTestDB,
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.DENALICTP3\MSSQL\DATA\FileTableTestDB.mdf'
),
FILEGROUP FileTableTestDBFS CONTAINS FILESTREAM
(
    NAME = FileTableTestDBFS,
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.DENALICTP3\MSSQL\DATA\FileTableTestDB'
)
LOG ON
(
    NAME = FileTableTestDBLog,
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.DENALICTP3\MSSQL\DATA\FileTableTestDB.ldf'
)
GO
--go to the test database
USE FileTableTestDB
GO

--The sys.database_files should show you three rows, one for data file, one for log file and one for FILESTREAM
SELECT * FROM sys.database_files WHERE type_desc = 'FILESTREAM'
Go

--go back to master
USE MASTER
GO

--Alter the FileTableTestDB and set the non transaction access level and directory name
ALTER DATABASE FileTableTestDB
    SET FILESTREAM ( NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'FileTableTestDB')
GO

NON_TRANSACTED_ACCESS Specifies the level of non-transactional FILESTREAM access to the database.

It can be set to the following values:

Value Description
OFF Non-transactional access is disabled.
READONLY FILESTREAM data in this database can be read by non-transactional processes.
FULL Full non-transactional access to FILESTREAM FileTables is enabled.

You can access the file share at \\<machine_name>\DENALICTP3, which is the root folder for the instance and under this folder you will see FileTableTestDB folder which is the root folder for FileTableTestDB database. OR you can query the catalog view sys.database_filestream_options.

SELECT
	DB_NAME ( database_id ),
	directory_name
FROM
	sys.database_filestream_options;

Now let’s create the FileTable

Creating FileTable

--go back to FileTableTestDB
USE FileTableTestDB
GO

--Create the FileTable
CREATE TABLE tblFileTableTest AS FileTable
WITH
(
	    FileTable_Directory = 'FileTableTestDB',
	    FileTable_Collate_Filename = database_default
)
GO

Notice that I have used the folder name same as the FileTable table name for simplicity sake because in my test I am going to have only one FileTable. But in a real life scenario where we might have multiple FileTables for each business unit or business process then it is recommeded to use appropriate table names and FileTable_Directory names. Each FileTable_Directory gets created under the root folder mapped to the database. If you create the FILETABLE without the directory and the collate filename then the name of the FileTable becomes the name of the FILETABLE_DIRECTORY, and the value of FILETABLE_COLLATE_FILENAME becomes database_default.

Now that the table is created, the real fun begins. Go to the shared folder (in my case it is \\DENALICTP3\FileTableTestDB\tblFileTableTest) using the windows share and drop some files and create a folde or two. I added a document, a pdf and created a sub-folder with three image files. And this is how my FileTable and Windows share look like:

Notice, that  each file or folder is a row in the table as stated before, the file_stream is NULL for “Images” folder. Also, notice parent_path_locator for all the image files is the same as the path_locator for the “Images” folder.

Play around a little more and add or delete files or folders and you will see the changes take place in the FileTable instantly.

For additional ways to load files in FileTables check this link out – Load Files into FileTables

Restriction on nesting level
You cannot store more than 15 levels of subdirectories in the FileTable directory. When you store 15 levels of subdirectories, then the lowest level cannot contain files, since these files would represent an additional level.

Altering a FileTable

Since a FileTable has a pre-defined and fixed schema, you cannot add or change its columns. However, you can add custom indexes, triggers, constraints, and other options to a FileTable.  You can change the directory name for FileTable by excuting following command

ALTER TABLE tblFileTableTest SET (FILETABLE_DIRECTORY = 'FileTableTestDB_New')
GO

This command will rename the old directory name “FileTableTestDB” to “FileTableTestDB_New”.

Dropping a FileTable

You can drop a FileTable by using the DROP TABLE  command.  When a FileTable is dropped, all the associated database objects indexes, constraints and triggers are dropped. The contents of FileTable directory will disappear from the FILESTREAM file and directory hierarchy of the database.

Default database objects associated with a FileTable

It’s interesting to see all the database objects that get created with FileTable, the keys, indexes, check constraints and defaults. Run the following SQL Query to get the details on all the associated database objects that get created with the FileTable.

SELECT distinct
	'FileTable' = OBJECT_NAME(sdo.parent_object_id),
	'System-defined Object' = OBJECT_NAME(sdo.object_id),
	'Column Name' = ac.name,
	'Type'=
		CASE
			WHEN sdo.object_id = dc.object_id THEN dc.type_desc
			WHEN sdo.object_id = kc.object_id THEN kc.type_desc
			WHEN sdo.object_id = cc.object_id THEN cc.type_desc
			WHEN sdo.object_id = fk.object_id THEN fk.type_desc
			ELSE NULL
		END,
	'Constraint Definition'=
		CASE
			WHEN sdo.object_id = dc.object_id THEN dc.definition
			WHEN sdo.object_id = cc.object_id THEN cc.definition
			ELSE NULL
		END,
	'Index Type' = i.type_desc
FROM
	sys.filetable_system_defined_objects sdo left join sys.default_constraints dc on sdo.object_id = dc.object_id
	left join sys.all_columns ac on dc.parent_column_id = ac.column_id and sdo.parent_object_id = ac.object_id
	left join sys.key_constraints kc on sdo.object_id = kc.object_id
	left join sys.check_constraints cc on sdo.object_id = cc.object_id
	left join sys.foreign_keys fk on sdo.object_id = fk.object_id
	left join sys.indexes i on sdo.parent_object_id = i.object_id and kc.parent_object_id = i.object_id and kc.name = i.name
ORDER
	BY FileTable, 'System-defined Object';

Benefits of using FileTables

  • Unstructured data when becomes part of SQL Server take advantage of integrated administration features like backup, auditing etc.
  • A hierarchical namespace of directories and files.
  • Access to file attributes with simple T_SQL commands
  • Windows file and directory management API based applications can continue to work using the windows share.
  • FileTables can be queried and updated through Transact-SQL . This makes reporting on the unstructured data a breeze instead of traversing through the whole file system.
  • Changes to file system are instantly updated/reflected in the FileTables.

Imagine if you need to know the number of files, directories and unique file types or the number of files for each file type. It is as simple as running the following SQL commands:-

SELECT
	'Files' = SUM(CASE WHEN is_directory = 0 then 1 Else 0 END),
	'Directories' = SUM(CASE WHEN is_directory = 1 then 1 Else 0 END),
	'FileTypes' = COUNT(distinct file_type)
From
	tblFileTableTest

OR

SELECT
	file_type,
	count(*)
From
	tblFileTableTest
Where
	is_directory = 0
Group By
	file_type

This is a feature worth keeping an eye if you are writing custom applications to manage terabytes of data.

Written by Simran Jindal

August 30, 2011 at 10:14 am

3 Responses

Subscribe to comments with RSS.

  1. [...] enhancements - FileTables, enabling the storage of files including documents that can be accessed from Windows applications, [...]

  2. [...] Denali CTP 3 – How to Set Up the FileTable and access the unstructured data from SQL Server [...]

  3. Hi,

    I played and tested this a bit and i found one big disadvantage in this feature.
    While copying a file, you cannot execute any queries on the DB!
    What happens if you constantly need to copy\delete\modify files?

    Can it really be that no reads are available or am I missing something ?

    If so, this is a major issue and should be pointed out clearly as a limitation.

    Would be happy to hear what you think about it.

    Matan

    Matan

    February 27, 2012 at 1:51 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 185 other followers