Collect Application Information
Prerequisites
The following components must be installed on the system to run the apptodb.pl script:
- perl >= 5.8.8
- python >= 2.4.2
- 'readelf' command from either binutils or elfutils
- file
- coreutils (the 'uname' command may be needed to guess the architecture)
- to collect data from rpm files, the 'rpm', 'rpm2cpio' and 'cpio' commands are required
- to collect data from deb files, the 'dpkg-deb', 'lzma', 'tar', 'gzip' and 'bzip2' commands are required
Note that apptodb.pl uses componenttodb.pl script.
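The prerequisites above can be checked before running the script. The following is a minimal sketch, not part of apptodb.pl: the grouping into three lists and the function name missing_commands are illustrative assumptions, and the check only looks for the commands on PATH (it does not verify the perl or python versions).

```python
import shutil

# Commands named in the prerequisites list; the grouping is illustrative.
BASE_TOOLS = ["perl", "python", "readelf", "file", "uname"]
RPM_TOOLS = ["rpm", "rpm2cpio", "cpio"]
DEB_TOOLS = ["dpkg-deb", "lzma", "tar", "gzip", "bzip2"]


def missing_commands(commands, which=shutil.which):
    """Return the commands from `commands` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [cmd for cmd in commands if which(cmd) is None]
```

For example, missing_commands(BASE_TOOLS + RPM_TOOLS) returns an empty list when everything needed for processing rpm files is present.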
Getting apptodb.pl
apptodb.pl and its accompanying scripts are available after LSB ATK Manager installation in the '/opt/lsb/app-testkit/manager/utils' directory. The latest version is available in the unofficial bzr repository. You may also contact the ISP RAS team to get the latest versions of the scripts and any support information.
apptodb.pl - Get Application Data
Usage: apptodb.pl [options] app_name
app_name Name of the application. Can be absent if single rpm or deb file is processed.
-a|--arch Specify architecture. Possible values are IA32, IA64, PPC32,
PPC64, S390, S390X and x86-64. By default the output of `uname -m`
is analyzed
-h|--help Display this help message
-v|--version app_version Application version. If none is specified, the script will try
to obtain it from package manager
-i|--infile filename File with the list of packages to be processed (i.e. packages
that form the application). If no file is specified, application
name is considered to be the only package
-l|--filelist filename File containing list of file names that are installed by the
application. (Directory names are also allowed; all files from
given directories will be processed)
-r|--rpm filename RPM file with application to be processed. Use this option if you don't want to
install the application and the whole application is provided by single rpm file.
NOTE: RPM file can be given as script argument, without these options. In this case
application name will be taken from RPM information.
--rpmlist filename List of RPM files containing application binaries. Use this option if you don't want to
install the application and the application is provided by several rpm files.
-d|--deb filename Deb file with application to be processed. Use this option if you don't want to
install the application and the whole application is provided by a single deb file.
NOTE: deb file can be given as script argument, without these options. In this case
application name will be taken from deb information.
--deblist filename List of Deb files containing application binaries. Use this option if you don't want to
install the application and the application is provided by several deb files.
-o|--outfile filename Filename where the data for upload will be situated.
By default, "<app_name>_<app_ver>_<arch>_upload_data" is used
-q|--quiet Be really quiet - do not produce any messages except critical errors
--verbose Be verbose
-u|--url URL Application homepage URL
-t|--licensetype type License type. "Open Source" and "Proprietary" are valid values at the moment.
-p|--packager packager Application packager - any information about uploaded application build
(distribution, maintainer, etc.). This can be useful since different builds of the same
application may require different sets of external interfaces
-m|--packagemanager pkg_name Package manager used in the distribution. Supported values
are 'rpm', 'dpkg' and 'emerge'. If none is specified, the script
will try to guess it
--ui ui_kind User interface style. Possible values are 'Command Line' and 'GUI'
--category category_kind Application functional category. Possible values are:
'Accessibility and i18n'
'Antivirus and Security'
'Data Management'
'Development'
'Emulators'
'Games'
'Multimedia and Graphics'
'Network'
'Office and Desktop'
'Science and Education'
'System Tools'
'X11 Utilities'
-s|--summary summary_text Application summary. If none is specified, the script will try to obtain it from
package manager; if rpm or deb file is being processed, the script will try to
obtain necessary data from the file itself
-c|--comment comment_text Comment for the application. Any useful information about application build process
can be added here (e.g. which libraries were linked statically).
Collects information about an application. There are four ways of collecting this information:
- using the system package manager;
- using rpm file(s) containing application binary files;
- using deb file(s) containing application binary files;
- directly specifying files installed by the application (see detailed information below).
The application version is obtained from the package manager, but can be specified directly on the command line.
As a result, apptodb.pl produces a file containing data about the application (the file format is described below). The bulk of this file is readelf output. To see the general information about the application (library names, command names, etc.), we recommend grepping the output file for the '!' character.
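The filtering suggested above can also be done programmatically. The following is a minimal sketch, assuming that every header record begins with '!' at the start of a line (as in the format description); the function name summary_lines is illustrative. Unlike a plain grep for '!', it ignores lines that merely contain the character somewhere in the readelf output.

```python
def summary_lines(lines):
    """Yield only the '!' header records, skipping the readelf output."""
    for line in lines:
        if line.startswith("!"):
            yield line.rstrip("\n")
```

It accepts any iterable of lines, so an open file object for the produced upload data file can be passed directly.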
Collect Information Using Package Manager
The application should be installed using the system package manager (currently rpm, dpkg and emerge are supported). If the application consists of a single package whose name is the same as the application name, then apptodb.pl only needs to be given the application name (see usage above). Otherwise, provide apptodb.pl with a file listing the packages the application consists of (a simple text file, one package name per line). For each application package, apptodb.pl calls componenttodb.pl and then joins the data obtained.
Using the system package manager, apptodb.pl gets the list of files installed by the application, determines the format of these files (using the 'file' command) and calls the 'readelf' command for shared objects and ELF executables.
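To illustrate the classification step, here is a minimal sketch of how ELF files can be told apart from everything else. Note that it does not reproduce what apptodb.pl does: the script relies on the 'file' command, while this sketch swaps in a direct check of the ELF header (magic bytes and the e_type field). The function name elf_kind is an assumption made for the example.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ET_EXEC, ET_DYN = 2, 3  # e_type values defined by the ELF specification


def elf_kind(path):
    """Return 'executable', 'shared object', or None.

    None means the file is not ELF or has another ELF type
    (for example a relocatable object or a core file).
    """
    with open(path, "rb") as f:
        header = f.read(18)
    if len(header) < 18 or header[:4] != ELF_MAGIC:
        return None
    # Byte 5 (EI_DATA) gives the byte order: 1 = little endian, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack(byte_order + "H", header[16:18])
    return {ET_EXEC: "executable", ET_DYN: "shared object"}.get(e_type)
```

One known limitation of this approach: position-independent executables also carry the ET_DYN type, so they are reported as shared objects here.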
Examples
Collect information about 'opera'. This assumes that the 'opera' application is installed on the system using the system package manager and that the corresponding package is also named 'opera'.
apptodb.pl opera
Now let's add the MySQL Server application, installed on an OpenSUSE system by two packages - 'mysql' and 'mysql-shared'. To do this, create a file 'mysql_packages' with the following lines:
mysql
mysql-shared
and then run apptodb.pl:
apptodb.pl -i mysql_packages mysql
Collect Information Using RPM File(s)
If it is undesirable to install the application in the system and the application is provided by one or more rpm files, then such rpm files can be directly processed by the script.
If the whole application is provided by a single rpm file, then this rpm file can simply be given as the apptodb.pl argument. The application name will be taken from the rpm information (if you want to name your application manually, specify the rpm file using the '-r' or '--rpm' option, and give the desired name to apptodb.pl as the argument). If the application consists of more than one rpm file, then a file containing the list of rpms should be created and its name passed to the script using the '--rpmlist' option.
When apptodb.pl processes rpm files, they are unpacked with 'rpm2cpio' into a unique directory inside /var/tmp/ whose name starts with 'apptodb', and their contents are then analyzed (as if the '-l' option were used). If everything goes well, the directory is removed afterwards.
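The unpacking step can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: the function names are hypothetical, the 'cpio -idm --quiet' flags are a common way to extract an archive produced by 'rpm2cpio' and may differ from what apptodb.pl passes, and the analyze callback stands in for the real analysis.

```python
import shutil
import subprocess
import tempfile


def make_workdir(parent="/var/tmp"):
    """Create a unique directory under `parent` whose name starts with 'apptodb'."""
    return tempfile.mkdtemp(prefix="apptodb", dir=parent)


def unpack_rpm(rpm_path, workdir):
    """Extract the payload of `rpm_path` into `workdir` via rpm2cpio | cpio."""
    producer = subprocess.Popen(["rpm2cpio", rpm_path], stdout=subprocess.PIPE)
    # Assumed cpio flags: -i extract, -d create directories, -m keep mtimes.
    consumer = subprocess.run(
        ["cpio", "-idm", "--quiet"], stdin=producer.stdout, cwd=workdir
    )
    producer.stdout.close()
    if producer.wait() != 0 or consumer.returncode != 0:
        raise RuntimeError("failed to unpack " + rpm_path)


def process_rpm(rpm_path, analyze, parent="/var/tmp"):
    """Unpack, analyze, and remove the work directory only on success."""
    workdir = make_workdir(parent)
    unpack_rpm(rpm_path, workdir)
    result = analyze(workdir)
    shutil.rmtree(workdir)  # not reached if unpacking or analysis raised
    return result
```

Leaving the directory in place when an exception is raised mirrors the behaviour described above, where the directory is removed only if everything went well.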
Note: The application name can be omitted when a single rpm or deb file is processed. In this case the application name will be obtained from the file. The application homepage URL and summary will also be obtained from the rpm file if they are not specified on the command line.
Examples
Collect information about 'jre'. All necessary information (application version, packager, etc.) not specified in command line will be obtained from rpm file.
apptodb.pl -r jre-6u1-linux-amd64.rpm jre
Now let's add the MySQL Server application, provided on an OpenSUSE 10.2 system by two rpms - 'mysql-5.0.26-12.x86_64.rpm' and 'mysql-shared-5.0.26-12.x86_64.rpm'. To do this, create a file 'mysql_rpms' with the following lines:
mysql-5.0.26-12.x86_64.rpm
mysql-shared-5.0.26-12.x86_64.rpm
and then run apptodb.pl:
apptodb.pl --rpmlist mysql_rpms -v "5.0.26" -u "http://www.mysql.com" -c "A True Multiuser, Multithreaded SQL Database Server" mysql
Note that here we had to specify all package properties (version, comment and URL) explicitly, since the script does not guess them when several rpm files are processed.
Collect Information Using Deb File(s)
Everything works the same way as for rpm files, but use the '-d' option instead of '-r' and '--deblist' instead of '--rpmlist'.
Collect Information Using File List
If you don't want to install your application using the system package manager, or if your installation is unusual and the package manager doesn't have complete information about the installed files (for example, if you install your application by simply extracting an archive into a given directory), you can provide apptodb.pl with a text file listing the files installed by the application using the '-l' option (a simple text file, one file name per line; see also usage above). Note that you may specify not only individual files but also directory names - in this case all files in the directory (and all its subdirectories) will be processed.
Example
Suppose that we have manually installed the application 'foo' in the '/opt/foo' directory and that one library from this application, 'foo.so', is installed in the '/usr/lib' directory. To collect information about the 'foo' application, one needs to create a file, say 'foo_files', with the following lines:
/opt/foo
/usr/lib/foo.so
and then call apptodb.pl, providing it with the application version and a comment (this is optional, but very useful):
apptodb.pl -l foo_files -v 1.0 -c "Foo application" foo
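The way a file list with directory entries expands into individual files can be sketched as follows. This is a minimal illustration of the behaviour described for the '-l' option, not the script's own code; the function name expand_file_list is an assumption.

```python
import os


def expand_file_list(list_path):
    """Read one path per line and expand directories recursively into files."""
    files = []
    with open(list_path) as fh:
        for raw in fh:
            entry = raw.strip()
            if not entry:
                continue  # skip blank lines
            if os.path.isdir(entry):
                # Collect every file below the directory, subdirectories included.
                for root, _dirs, names in os.walk(entry):
                    files.extend(os.path.join(root, n) for n in sorted(names))
            else:
                files.append(entry)
    return files
```

With the 'foo_files' list from the example, the result would contain every file found under '/opt/foo' followed by '/usr/lib/foo.so'.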
Format of File Produced by apptodb.pl
File starts with the following string:
!Application '<name>' '<version>' '<arch>' '<summary>'
Then some additional application information is printed in one line:
!AppCategory '<license_type>' '<ui_kind>' '<functional_category>'
followed by components sections. Each component section starts with the following string:
!Component '<name>' '<version>' '<package>' '<arch>'
and contains descriptions of libraries (i.e. shared objects), commands (i.e. ELF executables) and executable scripts of the component.
Note that the component's parameters (name, version, package, arch) are not currently used.
Description of a library (shared object) starts with the following string:
!Library '<name>' '<version>' '<runname>' '<full_path>' '<ABI_tag>' '<soname>'
Then readelf output for this library is printed. <ABI_tag> and <soname> are optional and may be absent.
Description of a command starts with the following string:
!Command '<name>' '<full_path>'
Then readelf output for this command is printed.
Description of a script consists of a single string:
!Script '<name>' '<full_path>' '<kind>'
Python and Perl scripts and modules are processed separately. For every python module the following record is dumped:
!PythonModule '<name>' '<full_path>' '<type_reported_by_file>'
For every perl module:
!PerlModuleProvided '<name>' '<full_path>'
The formats differ slightly because there is no reliable method to detect python scripts. The file extension seems to be the most reliable sign, but more information is needed for cases where it is unclear whether a file is python or not.
For perl, the 'file' command is much more reliable (though some surprises happen there, too).
For every python/perl script or module, records describing its dependencies and the interpreter requested (by '#!') are also dumped:
PythonInterpreter: <'#!'-interpreter>
PythonModuleUsed: <name>
PerlInterpreter: <'#!'-interpreter>
PerlModuleUsed: <used>
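The '!' records described in this section can be split into a record type and a list of fields. The following is a minimal sketch, assuming that field values never contain a single quote (how apptodb.pl handles such values is not specified here); the function name parse_record is illustrative.

```python
import re

# A record is '!' + type name, followed by zero or more single-quoted fields.
RECORD_RE = re.compile(r"^!(\w+)((?:\s+'[^']*')*)\s*$")
FIELD_RE = re.compile(r"'([^']*)'")


def parse_record(line):
    """Return (record_type, [fields]) for a '!' record, or None otherwise."""
    match = RECORD_RE.match(line)
    if match is None:
        return None
    return match.group(1), FIELD_RE.findall(match.group(2))
```

For a hypothetical line such as !Command 'foo' '/opt/foo/bin/foo', this gives the type 'Command' and the fields 'foo' and '/opt/foo/bin/foo'. Lines of readelf output and the interpreter/module records without a leading '!' yield None.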
ATK Manager
Information about applications can also be collected using the LSB ATK Manager, which provides a web interface to the 'apptodb.pl' script (see ATK Manager Getting Started).