Tracker Extract Library Reference Manual
#include <libtracker-extract/tracker-extract.h>

gchar* tracker_coalesce (gint n_values,...);
const gchar* tracker_coalesce_strip (gint n_values,...);
gchar* tracker_merge (const gchar *delimiter,gint n_values,...);
gchar* tracker_merge_const (const gchar *delimiter,gint n_values,...);
gssize tracker_getline (gchar **lineptr,gsize *n,FILE *stream);
gchar* tracker_text_normalize (const gchar *text,guint max_words,guint *n_words);
gboolean tracker_text_validate_utf8 (const gchar *text,gsize text_len,GString **str,gsize *valid_len);
gchar* tracker_date_format_to_iso8601 (const gchar *date_string,const gchar *format);
gchar* tracker_date_guess (const gchar *date_string);
This API provides common, general-purpose utility functions which extractors may find useful. The in-house extractors also use these functions quite frequently.
gchar* tracker_coalesce (gint n_values,...);
tracker_coalesce has been deprecated since version 1.0 and should not be used in newly-written code. Use tracker_coalesce_strip() instead.
This function iterates through a series of string pointers passed
using Varargs and returns the first which is not NULL, not empty
(i.e. "") and not comprised of one or more spaces (i.e. " ").
The returned value is stripped using g_strstrip(). All other values
supplied are freed. It is MOST important NOT to pass constant
string pointers to this function!
| n_values : | the number of Varargs supplied |
| ... : | the string pointers to coalesce |
| Returns : | the first string pointer from those provided which matches, otherwise NULL. |
Since 0.8
const gchar* tracker_coalesce_strip (gint n_values,...);
This function iterates through a series of string pointers passed
using Varargs and returns the first which is not NULL, not empty
(i.e. "") and not comprised of one or more spaces (i.e. " ").
The returned value is stripped using g_strstrip(). It is MOST
important NOT to pass constant string pointers to this function!
| n_values : | the number of Varargs supplied |
| ... : | the string pointers to coalesce |
| Returns : | the first string pointer from those provided which matches, otherwise NULL. |
Since 0.9
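The selection rule described above can be sketched in plain C without GLib. This is not Tracker's implementation: `coalesce_strip_sketch` and `strip_in_place` are hypothetical names, and `strip_in_place` only approximates what g_strstrip() does. The sketch also shows why constant strings must not be passed, since the chosen value is trimmed in place.

```c
#include <assert.h>
#include <ctype.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for g_strstrip(): trims leading and trailing
 * whitespace in place and returns the same pointer. */
static char *strip_in_place(char *s)
{
    char *start = s;
    size_t len;

    while (*start != '\0' && isspace((unsigned char) *start))
        start++;
    len = strlen(start);
    while (len > 0 && isspace((unsigned char) start[len - 1]))
        len--;
    memmove(s, start, len);
    s[len] = '\0';
    return s;
}

/* Returns the first argument that is not NULL and not blank once
 * stripped, or NULL if none qualifies. Candidates are modified in
 * place, so every non-NULL argument must point to writable memory. */
static const char *coalesce_strip_sketch(int n_values, ...)
{
    va_list args;
    const char *result = NULL;
    int i;

    assert(n_values >= 0);

    va_start(args, n_values);
    for (i = 0; i < n_values; i++) {
        char *value = va_arg(args, char *);

        if (value != NULL && *strip_in_place(value) != '\0') {
            result = value;
            break;
        }
    }
    va_end(args);

    return result;
}
```

With writable arrays holding "   ", "  Title  " and "Other", plus a NULL pointer cast to `char *`, a call with those four values returns the second array, now trimmed to "Title".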
gchar* tracker_merge (const gchar *delimiter,gint n_values,...);
tracker_merge has been deprecated since version 1.0 and should not be used in newly-written code. Use tracker_merge_const() instead.
This function iterates through a series of string pointers passed
using Varargs and returns a newly allocated string of the merged
strings. All passed strings are freed (don't pass const values).
The delimiter can be NULL. If specified, it will be used in
between each merged string in the result.
| delimiter : | the delimiter to use when merging |
| n_values : | the number of Varargs supplied |
| ... : | the string pointers to merge |
| Returns : | a newly-allocated string holding the result which should be freed with g_free() when finished with, otherwise NULL. |
Since 0.8
gchar* tracker_merge_const (const gchar *delimiter,gint n_values,...);
This function iterates through a series of string pointers passed
using Varargs and returns a newly allocated string of the merged
strings.
The delimiter can be NULL. If specified, it will be used in
between each merged string in the result.
| delimiter : | the delimiter to use when merging |
| n_values : | the number of Varargs supplied |
| ... : | the string pointers to merge |
| Returns : | a newly-allocated string holding the result which should be freed with g_free() when finished with, otherwise NULL. |
Since 0.9
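As an illustration of the merging behavior, here is a minimal standalone sketch in plain C. It is not Tracker's code: `merge_const_sketch` is a hypothetical name, malloc()/free() stand in for the GLib allocator, and skipping NULL entries is an assumption rather than something the description above states.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Joins the non-NULL arguments, placing delimiter (if not NULL)
 * between them. The inputs are only read, never freed or modified.
 * Returns a malloc()ed string, or NULL if nothing was merged. */
static char *merge_const_sketch(const char *delimiter, int n_values, ...)
{
    va_list args;
    char *result = NULL;
    size_t length = 0;
    size_t delimiter_length = delimiter != NULL ? strlen(delimiter) : 0;
    int i;

    assert(n_values >= 0);

    va_start(args, n_values);
    for (i = 0; i < n_values; i++) {
        const char *value = va_arg(args, const char *);
        int is_first = (result == NULL);
        size_t value_length, extra;
        char *grown;

        if (value == NULL)
            continue; /* assumption: NULL entries are skipped */

        value_length = strlen(value);
        extra = (is_first ? 0 : delimiter_length) + value_length;
        grown = realloc(result, length + extra + 1);
        if (grown == NULL) {
            free(result);
            result = NULL;
            break;
        }
        result = grown;

        if (!is_first && delimiter_length > 0) {
            memcpy(result + length, delimiter, delimiter_length);
            length += delimiter_length;
        }
        memcpy(result + length, value, value_length);
        length += value_length;
        result[length] = '\0';
    }
    va_end(args);

    return result;
}
```

Because the inputs are untouched, string literals are safe here, which is the practical difference from the deprecated tracker_merge(). The caller owns the result and releases it with free() in this sketch (g_free() for the real function).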
gssize tracker_getline (gchar **lineptr,gsize *n,FILE *stream);
Reads an entire line from stream, storing the address of the buffer containing the text into *lineptr. The buffer is null-terminated and includes the newline character, if one was found.
See the GNU getline() manual page for more information.
| lineptr : | Buffer to write into |
| n : | Max bytes of linebuf |
| stream : | Filestream to read from |
| Returns : | the number of characters read, including the delimiter character, but not including the terminating NULL byte. This value can be used to handle embedded NULL bytes in the line read. Upon failure, -1 is returned. |
Since 0.9
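Since the description defers to GNU getline(), a short sketch of the usual calling pattern may help. It calls POSIX getline() directly, on the assumption that tracker_getline() accepts the same three arguments in the same way; `count_lines` is a hypothetical helper, not part of the library.

```c
#define _POSIX_C_SOURCE 200809L /* exposes getline() in <stdio.h> */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: counts the lines in stream and reports the
 * length in bytes of the longest one, newline included. */
static size_t count_lines(FILE *stream, size_t *longest)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t capacity = 0; /* current buffer size, maintained by getline() */
    ssize_t length;
    size_t count = 0;

    assert(stream != NULL && longest != NULL);
    *longest = 0;

    /* -1 signals end of file or a read error */
    while ((length = getline(&line, &capacity, stream)) != -1) {
        count++;
        if ((size_t) length > *longest)
            *longest = (size_t) length;
    }

    free(line); /* the same buffer is reused for every line */
    return count;
}
```

Starting with a NULL buffer and a capacity of 0 lets the function allocate on the first call; the buffer is freed once, after the loop.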
gchar* tracker_text_normalize (const gchar *text,guint max_words,guint *n_words);
tracker_text_normalize has been deprecated since version 1.0 and should not be used in newly-written code. Use tracker_text_validate_utf8() instead.
This function iterates through text checking for UTF-8 validity
using g_utf8_get_char_validated(). For each character found, the
GUnicodeType is checked to make sure it is one of the following
values:
All other symbols, punctuation, marks, numbers and separators are stripped. A regular space (i.e. " ") is used to separate the words in the returned string.
The n_words can be NULL. If specified, it will be populated with
the number of words that were normalized in the result.
| text : | the text to normalize |
| max_words : | the maximum words of text to normalize |
| n_words : | the number of words actually normalized |
| Returns : | a newly-allocated string holding the result which should be freed with g_free() when finished with, otherwise NULL. |
Since 0.8
gboolean tracker_text_validate_utf8 (const gchar *text,gsize text_len,GString **str,gsize *valid_len);
This function iterates through text checking for UTF-8 validity
using g_utf8_validate(), appends the first chunk of valid characters
to str, and gives the number of valid UTF-8 bytes in valid_len.
| text : | the text to validate |
| text_len : | length of text, or -1 if NUL-terminated |
| str : | the string where to place the validated UTF-8 characters, or NULL if not needed. |
| valid_len : | Output number of valid UTF-8 bytes found, or NULL if not needed |
| Returns : | TRUE if some bytes were found to be valid, FALSE otherwise. |
Since 0.9
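To make the "first chunk of valid characters" idea concrete, the following standalone sketch computes the length of the valid UTF-8 prefix of a buffer. It is a simplified illustration, not Tracker's code: it uses a hand-written validator instead of g_utf8_validate(), requires an explicit length, and omits the GString output. The names `utf8_valid_prefix` and `validate_utf8_sketch` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns the number of leading bytes of text (at most text_len) that
 * form complete, well-formed UTF-8 sequences. */
static size_t utf8_valid_prefix(const unsigned char *text, size_t text_len)
{
    size_t i = 0;

    while (i < text_len) {
        unsigned char lead = text[i];
        unsigned long cp, min;
        size_t need, k;

        if (lead < 0x80) {
            need = 0; cp = lead; min = 0;
        } else if ((lead & 0xE0) == 0xC0) {
            need = 1; cp = lead & 0x1F; min = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            need = 2; cp = lead & 0x0F; min = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            need = 3; cp = lead & 0x07; min = 0x10000;
        } else {
            break; /* stray continuation byte or invalid lead byte */
        }

        if (need > text_len - i - 1)
            break; /* sequence is cut off by the end of the buffer */

        for (k = 1; k <= need; k++) {
            if ((text[i + k] & 0xC0) != 0x80)
                break;
            cp = (cp << 6) | (text[i + k] & 0x3F);
        }
        if (k <= need)
            break; /* malformed continuation byte */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            break; /* overlong encoding, out of range, or surrogate */

        i += need + 1;
    }

    return i;
}

/* Mirrors the documented contract: reports the valid byte count via
 * valid_len (if not NULL) and returns true if any bytes were valid. */
static bool validate_utf8_sketch(const char *text, size_t text_len,
                                 size_t *valid_len)
{
    size_t n;

    assert(text != NULL || text_len == 0);

    n = utf8_valid_prefix((const unsigned char *) text, text_len);
    if (valid_len != NULL)
        *valid_len = n;
    return n > 0;
}
```

For the 7-byte input "abc", 0xFF, "def", the sketch reports 3 valid bytes and returns true; validation stops at the first invalid byte rather than skipping over it.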
gchar* tracker_date_format_to_iso8601 (const gchar *date_string,const gchar *format);
This function uses strptime() to create a struct tm from
date_string and format.
| date_string : | the date in a string pointer |
| format : | the format of the date_string |
| Returns : | a newly-allocated string with the time represented in ISO8601 date format which should be freed with g_free() when finished with, otherwise NULL. |
Since 0.8
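The strptime() approach mentioned above can be sketched in standalone C as follows. This is an assumption-laden illustration rather than the library's implementation: `format_to_iso8601_sketch` is a hypothetical name, the output omits any time zone offset, and malloc() stands in for the GLib allocator.

```c
#define _XOPEN_SOURCE 700 /* exposes strptime() in <time.h> */

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Parses date_string according to format and renders the broken-down
 * time as "YYYY-MM-DDThh:mm:ss". Time zones are deliberately ignored.
 * Returns a malloc()ed string, or NULL if parsing fails. */
static char *format_to_iso8601_sketch(const char *date_string,
                                      const char *format)
{
    struct tm tm;
    char buffer[32];
    char *copy;

    assert(date_string != NULL && format != NULL);

    memset(&tm, 0, sizeof tm); /* fields not set by strptime() stay 0 */
    if (strptime(date_string, format, &tm) == NULL)
        return NULL;

    if (strftime(buffer, sizeof buffer, "%Y-%m-%dT%H:%M:%S", &tm) == 0)
        return NULL;

    copy = malloc(strlen(buffer) + 1);
    if (copy != NULL)
        strcpy(copy, buffer);
    return copy;
}
```

For example, the Exif-style string "2005:04:29 14:56:54" with the format "%Y:%m:%d %H:%M:%S" yields "2005-04-29T14:56:54" in this sketch.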
gchar* tracker_date_guess (const gchar *date_string);
This function uses a number of methods to try to guess the date
held in date_string. The date_string must be at least 5
characters long for any guessing to be attempted.
Some of the string formats guessed include:
"YYYY-MM-DD" (Simple format)
"20050315113224-08'00'" (PDF format)
"20050216111533Z" (PDF format)
"Mon Feb 9 10:10:00 2004" (Microsoft Office format)
"2005:04:29 14:56:54" (Exif format)
"YYYY-MM-DDThh:mm:ss.ff+zz:zz"
| date_string : | the date in a string pointer |
| Returns : | a newly-allocated string with the time represented in ISO8601 date format which should be freed with g_free() when finished with, otherwise NULL. |
Since 0.8
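One way to picture the guessing step is to try a sequence of patterns until one matches. The standalone sketch below does this with sscanf() for a few of the listed formats only (Exif, ISO 8601 date and time, simple date, and the digits-only PDF prefix). It is not how Tracker implements the function: `date_guess_sketch` is a hypothetical name, and time zones, fractional seconds and the Microsoft Office format are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Tries several known layouts in turn and returns a malloc()ed
 * "YYYY-MM-DDThh:mm:ss" string, or NULL if no layout matches. */
static char *date_guess_sketch(const char *date_string)
{
    int y = 0, mo = 0, d = 0, h = 0, mi = 0, s = 0;
    const size_t size = 32;
    char *result;
    int written;

    if (date_string == NULL || strlen(date_string) < 5)
        return NULL; /* mirrors the documented minimum length */

    if (sscanf(date_string, "%4d:%2d:%2d %2d:%2d:%2d",
               &y, &mo, &d, &h, &mi, &s) == 6) {
        /* Exif: "2005:04:29 14:56:54" */
    } else if (sscanf(date_string, "%4d-%2d-%2dT%2d:%2d:%2d",
                      &y, &mo, &d, &h, &mi, &s) == 6) {
        /* ISO 8601 date and time; any zone suffix is ignored */
    } else if (sscanf(date_string, "%4d-%2d-%2d", &y, &mo, &d) == 3) {
        h = mi = s = 0; /* simple date: no time component */
    } else if (sscanf(date_string, "%4d%2d%2d%2d%2d%2d",
                      &y, &mo, &d, &h, &mi, &s) == 6) {
        /* PDF: "20050216111533Z"; the trailing zone is ignored */
    } else {
        return NULL;
    }

    /* Reject values that cannot be a calendar date or clock time. */
    if (y < 0 || mo < 1 || mo > 12 || d < 1 || d > 31 ||
        h < 0 || h > 23 || mi < 0 || mi > 59 || s < 0 || s > 60)
        return NULL;

    result = malloc(size);
    if (result == NULL)
        return NULL;

    written = snprintf(result, size, "%04d-%02d-%02dT%02d:%02d:%02d",
                       y, mo, d, h, mi, s);
    assert(written > 0 && (size_t) written < size);
    return result;
}
```

The order of the attempts matters: the more specific layouts are tried first so that a date with a time component is not matched by the date-only pattern.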