Initial import

author: Lev Levitsky 2015-06-14 15:55:04 +0300
committer: Lev Levitsky 2015-06-14 15:55:04 +0300
commit: 72a36dec00eb5fc9065856f35905baa4b1027776 (patch)
tree: 13bef12f918f40254e23f8567bd8da7f64f1e2b2
download: aur-72a36dec00eb5fc9065856f35905baa4b1027776.tar.gz
3 files changed, 586 insertions, 0 deletions
diff --git a/.SRCINFO b/.SRCINFO
new file mode 100644
index 000000000000..7af7e9639d9a
--- /dev/null
+++ b/.SRCINFO
@@ -0,0 +1,20 @@
+# Generated by makepkg 4.2.1
+# Wed Apr 15 22:17:40 UTC 2015
+pkgbase = python2-pyteomics
+	pkgdesc = A framework for proteomics data analysis.
+	pkgver = 3.0.1
+	pkgrel = 1
+	url = http://pythonhosted.org/pyteomics
+	changelog = CHANGELOG
+	arch = any
+	license = Apache
+	depends = python2
+	optdepends = python2-matplotlib: for pylab_aux module, optional
+	optdepends = python2-lxml: for XML parsing modules, recommended
+	optdepends = python2-numpy: for lots of features, highly recommended
+	options = !emptydirs
+	source = https://pypi.python.org/packages/source/p/pyteomics/pyteomics-3.0.1.tar.gz
+	md5sums = 2c838cc1c16dce69148662b883c755b9
+
+pkgname = python2-pyteomics
+
diff --git a/CHANGELOG b/CHANGELOG
new file mode 100644
index 000000000000..a9f2bb997e99
--- /dev/null
+++ b/CHANGELOG
@@ -0,0 +1,544 @@
+3.0.1
+-----
+
+ - Added `legend_kwargs` as a keyword argument to
+   :py:func:`pyteomics.pylab_aux.scatter_trend`.
+
+ - Minor fixes.
+
+3.0.0
+-----
+ - XML parsers are now implemented as objects, each format has its own class.
+   Those classes can be instantiated using the same arguments as :py:func:`read`
+   functions accepted, and support direct iteration and the ``with`` syntax.
+   The :py:func:`read` functions are now simple aliases to the corresponding
+   constructors.
+
+ - As a result, the :py:func:`iterfind`, :py:func:`version_info` and
+   :py:func:`get_by_id` functions are now deprecated in favor of methods
+   :py:meth:`iterfind` and :py:meth:`get_by_id` and attribute
+   :py:attr:`version_info` of corresponding instances.
+
+ - In :py:func:`pyteomics.mgf.write`, the order of keys and the format of values
+   are now controlled via module-level variables.
+
+ - In :py:mod:`pyteomics.electrochem`, correction for pK of terminal groups
+   depending on the terminal residue is implemented; example set of pK and
+   corrected pK added.
+
+ - Imports of external dependencies are delayed where possible, so that
+   unnecessary :py:exc:`ImportErrors` do not occur.
+
+ - :py:func:`local_fdr` renamed to :py:func:`qvalues` in :py:mod:`pepxml`,
+   :py:mod:`mzid`, :py:mod:`tandem` and :py:mod:`auxiliary`.
+   :py:func:`local_fdr` did not reflect the semantics of the function.
+   The algorithm has also been corrected so that the array of q-values
+   is always sorted (as it should be by definition).
+
+ - :py:func:`qvalues` now also accepts a parameter `full_output` which keeps the
+   PSMs alongside their scores and associated q-values.
+
+ - All :py:func:`fdr`, :py:func:`qvalues`, and :py:func:`!filter` functions
+   now accept a new parameter `correction`. It is used for more accurate
+   estimation of the number of false positives using TDA (paper with explanation
+   submitted to JPR).
+
+ - :py:func:`!filter` functions now support both iterator protocol and context
+   manager protocol. They now also accept the `full_output` parameter, which has
+   the following meaning: if :py:const:`True` (default), then an array of PSMs
+   is directly returned by the function. Otherwise, an iterator is returned, as
+   before. The array takes some memory, but this way is usually around 2x faster.
+
+ - New function :py:func:`pyteomics.pylab_aux.plot_qvalue_curve`.
+
+ - :py:class:`pyteomics.mass.Composition` objects now have a :py:meth:`mass`
+   method (equivalent to :py:func:`pyteomics.mass.calculate_mass`).
+
+ - Also, :py:class:`Composition` and objects returned by
+   :py:func:`pyteomics.parser.amino_acid_composition` now inherit from
+   :py:class:`collections.defaultdict` **and** :py:class:`collections.Counter`.
+
+ - Decoy-related functions in :py:mod:`pyteomics.fasta` now accept a new parameter
+   `keep_nterm` that preserves the N-terminal residue in the generated decoy
+   sequences.
+
+ - Minor fixes.
+
+API changes
+...........
+
+ - In :py:func:`pyteomics.pylab_aux.scatter_trend`, keyword arguments for
+   :py:func:`pylab.scatter` and :py:func:`pylab.plot` are now accepted as dicts
+   `scatter_kwargs` and `plot_kwargs`. Keyword argument `alpha` is now not
+   accepted and should be put in the appropriate dict.
+ - In :py:func:`pyteomics.pylab_aux.plot_function_3d` and
+   :py:func:`pyteomics.pylab_aux.plot_function_contour`, arbitrary kwargs can
+   now also be passed to the plotting function.
+ - :py:func:`!filter` functions do not support context manager protocol by
+   default. To keep using them as iterators / context managers, specify
+   ``full_output=False`` (see above for details).
+
+2.5.5
+-----
+
+Fix for a memory leak in :py:func:`pyteomics.mzid.get_by_id`, which affects
+:py:func:`pyteomics.mzid.read` with ``retrieve_refs=True``.
+
+2.5.4
+-----
+
+ - New functions :py:func:`local_fdr` in :py:mod:`pepxml`, :py:mod:`mzid`, and
+   :py:mod:`tandem`. The function returns a NumPy array with PSM scores and
+   corresponding values of local FDR.
+
+ - New parameter `iterative` in :py:func:`read` functions of XML parsing
+   modules. Parsing of mzIdentML files with ``retrieve_refs=True`` got
+   significantly faster.
+
+2.5.3
+-----
+
+ - Universally applicable modifications are now allowed in
+   :py:func:`pyteomics.parser.isoforms`.
+ - It is now also possible to specify non-terminal modifications which are
+   only applicable to terminal residues.
+ - Fix in :py:func:`pyteomics.parser.parse`: if the `labels` argument is
+   provided, it needs to contain standard terminal groups if they are present
+   in the sequence or if `show_unmodified_termini` is set to :py:const:`True`.
+ - :py:class:`pyteomics.mass.Composition` instances are now pickleable.
+ - Performance improvements.
+
+2.5.2
+-----
+
+ - New parameter `reverse` in all :py:func:`!filter` functions.
+ - New function :py:func:`pyteomics.mass.fast_mass2`, which is analogous to
+   :py:func:`pyteomics.mass.fast_mass`, but supports full *modX* notation and
+   is several times slower.
+ - Fix in :py:func:`pyteomics.pepxml.read` for compatibility with files
+   produced with Mascot2XML utility.
+ - Unknown labels now allowed in :py:mod:`pyteomics.electrochem` and
+   :py:mod:`pyteomics.achrom` functions in accordance with new general policy.
+
+2.5.1
+-----
+
+ - Bugfixes in :py:func:`pyteomics.parser.isoforms`:
+  - handling of the `labels` argument is now in accordance with new policy
+  - solved memory problems when using `max_mods`
+ - :py:func:`pyteomics.parser.cleave` does not require a valid *modX* sequence
+   by default.
+
+2.5.0
+-----
+
+ - :py:func:`pyteomics.parser.amino_acid_composition` now accepts "split"
+   parsed sequences.
+
+ - Cleavage rules in :py:data:`pyteomics.parser.expasy_rules` updated.
+
+ - Helper function :py:func:`pyteomics.parser.num_sites` counts the number
+   of cleavage sites in a sequence.
+
+ - Helper function :py:func:`pyteomics.parser.match_modX` does essentially
+   the same as :py:func:`pyteomics.parser.is_modX`, but returns a
+   :py:class:`re.match` object or :py:const:`None` instead of a :py:class:`bool`.
+
+ - Bugfix in :py:func:`pyteomics.auxiliary.filter`, which didn't work correctly
+   with iterators.
+
+ - Added a new parameter ``max_mods`` in :py:func:`pyteomics.parser.isoforms`.
+
+API changes
+...........
+
+ - The boolean ``overlap`` parameter in :py:func:`pyteomics.parser.cleave` is
+   replaced with an integer ``min_length``. Since ``min_length`` uses
+   :py:func:`pyteomics.parser.length`, the ``labels`` keyword argument is now
+   accepted by :py:func:`cleave` and :py:func:`num_sites`, if needed. With
+   carefully designed cleavage rules, all cleavage functions work
+   with *modX* sequences.
+
+ - The ``labels`` argument in :py:func:`pyteomics.parser.parse` and related
+   functions has changed its meaning. :py:func:`parse` won't raise an exception
+   for non-standard labels in sequences if the ``labels`` keyword argument is
+   not given.
+
+ - The *modX* notation specification is now more strict to avoid ambiguity:
+   only zero or two terminal groups can be present in a *modX* sequence.
+   Sequences with one terminal group specified will be supported where possible,
+   but be advised that sequences such as "H-OH" are intrinsically ambiguous.
+
+2.4.3
+-----
+
+ - Added the ``ratio`` keyword argument for FDR calculation.
+
+ - Minor changes in :py:func:`iterfind` functions of file parsers.
+
+ - Bugfix in :py:func:`pyteomics.mgf.write` (duplication of pepmass key).
+
+ - Removed non-functional parameter ``read_schema`` for
+   :py:func:`pyteomics.tandem.read`.
+
+2.4.2
+-----
+
+ - Bugfix in :py:func:`pyteomics.mass.most_probable_isotopic_composition`.
+   The bug manifested itself after version **2.4.0**, when
+   :py:data:`pyteomics.mass.nist_mass` was expanded. Also, the format of the
+   returned value is now in accordance with the documentation.
+
+2.4.1
+-----
+
+ - New function :py:func:`pyteomics.auxiliary.filter` for filtering lists
+   of PSMs not coming directly from files in supported formats.
+
+ - Also, a format-agnostic helper function :py:func:`pyteomics.auxiliary.fdr`.
+
+2.4.0
+-----
+
+ - New functions for filtering to a certain FDR level based on target-decoy
+   strategy, as well as for FDR estimation, in :py:mod:`pyteomics.tandem`,
+   :py:mod:`pyteomics.pepxml` and :py:mod:`pyteomics.mzid`. The functions are
+   called :py:func:`!filter` (beware of shadowing the built-in function) and
+   :py:func:`fdr` (in each of the modules). Chained versions
+   :py:func:`filter.chain` and :py:func:`filter.chain.from_iterable` are
+   also available. See `Data Access <data.html#general-notes>`_ for more info.
+
+ - New function :py:func:`pyteomics.parser.coverage` for sequence coverage
+   calculation.
+
+ - New function :py:func:`pyteomics.fasta.decoy_chain`, a chained version of
+   :py:func:`pyteomics.fasta.decoy_db`.
+
+ - New elements in :py:data:`pyteomics.mass.nist_mass`. Pretty much all elements
+   are there now.
+
+ - Fix in :py:func:`pyteomics.parser.parse` to cover some fancy corner cases.
+
+ - Bugfix in :py:mod:`pyteomics.tandem`: modification info is now fully extracted.
+
+ - :py:func:`pyteomics.mass.isotopic_composition_abundance` is now able to
+   calculate abundances for larger molecules.
+
+   .. note::
+       Rounding errors may be significant in this case.
+
+2.3.0
+-----
+
+ - New parameter "read_schema" in :py:func:`read` functions of XML parsing modules.
+   When set to :py:const:`False`, disables the attempts to fetch an auxiliary file
+   and obtain structure information about the file being parsed.
+
+ - New function :py:func:`chain` in all modules that have a :py:func:`read`
+   function, for convenient chaining of multiple files. :py:func:`chain` only
+   works as a context manager. Use :py:func:`itertools.chain` in other cases.
+   The ``chain.from_iterable`` form is also available as a context manager.
+
+ - New function :py:func:`pyteomics.auxiliary.print_tree` for exploration of
+   complex nested dicts produced by XML parsers.
+
+ - New sets of retention coefficients in :py:mod:`pyteomics.achrom`.
+
+ - Bugfix in :py:mod:`pyteomics.pepxml`. The bug caused an exception when parsing
+   some pepXML files.
+
+ - The output of :py:func:`pyteomics.mgf.read` now always contains a masked
+   array of charges.
+
+ - Other minor fixes.
+
+API change
+..........
+
+ - In :py:func:`pyteomics.mgf.read` the precursor charge is now always represented
+   by a list of ints (a :py:class:`ChargeList` object).
+
+2.2.2
+-----
+
+ - Bugfix in :py:mod:`pyteomics.tandem`. The info about all proteins is now
+   extracted.
+
+2.2.1
+-----
+
+ - Update parsers for FASTA headers.
+
+ - NamedTuple for FASTA entries is now defined globally, which should solve
+   pickling problems.
+
+2.2.0
+-----
+
+ - New module :py:mod:`pyteomics.tandem` for reading output files of X!Tandem
+   search engine.
+
+2.1.6
+-----
+
+ - Fix in :py:mod:`pyteomics.pepxml`. pepXML files generated by TPP are now
+   processed without errors.
+
+
+2.1.5
+-----
+
+ - Fix in :py:mod:`pyteomics.pepxml`. 'modified_peptide' is now always available.
+
+ - Fix in :py:mod:`pyteomics.mass` (issue #2 in the bug tracker).
+
+ - Improved arithmetic for :py:class:`Composition` objects.
+
+2.1.4
+-----
+
+ - In :py:mod:`fasta`, :py:func:`decoy_db` now doesn't write to file, but returns
+   an iterator over FASTA records. The old :py:func:`decoy_db` is now called
+   :py:func:`write_decoy_db`, which is equivalent to :py:func:`decoy_db` combined
+   with :py:func:`write`.
+
+Bugfixes:
+
+ - In :py:func:`pyteomics.mgf.read`, the charges, if present, are returned as a
+   masked array now. Previously, an exception occurred if charges were missing
+   for some of the fragments.
+
+ - Values in :py:data:`mass.nist_mass` corrected.
+
+ - Other minor corrections.
+
+2.1.3
+-----
+
+ - Adjust the behavior affected by the bug fixed in 2.1.2. `name` attributes
+   of `<cvParam>` elements in the absence of `value` attributes are now collected
+   in a list under the `'name'` key.
+
+ - Add support for overlapping matches in :py:func:`parser.cleave`.
+
+2.1.2
+-----
+
+ - Bugfix in XML parsers. The bug caused the mzML parser to break on some files.
+   The fix can slightly change the format of the output.
+
+2.1.1
+-----
+
+ - Rename keys in the dicts returned by :py:func:`mgf.read` to facilitate
+   writing code working with both MGF and mzML.
+
+ - The items yielded by :py:func:`fasta.read` now have attributes `description`
+   and `sequence`.
+
+2.1.0
+-----
+
+ - New sets of retention coefficients in :py:mod:`achrom`.
+
+ - :py:class:`mass.Composition` now only stores non-zero ints.
+
+ - :py:mod:`fasta` now has tools for parsing of FASTA headers.
+
+ - File parsers now implement the `context manager protocol
+   <http://docs.python.org/reference/datamodel.html#with-statement-context-managers>`_.
+   We recommend using `with` statements to avoid resource leaks.
+
+API changes
+...........
+
+ - 'pepmass' is now a tuple in the output of :py:func:`mgf.read` (to allow
+   reading precursor intensities).
+
+ - new function :py:func:`fasta.parse` for convenient parsing of FASTA headers.
+
+ - :py:data:`fasta.std_parsers` stores parsers for common UniProt header formats.
+
+ - new parameter *parser* in :py:func:`fasta.read` allows parsing headers while
+   reading a FASTA file.
+
+ - `close` parameter removed in all functions that do file I/O. The unified
+   behavior is: if the parameter is a file object, it won't be closed by the
+   function. If a file path is given, the file object will be created and closed
+   inside the corresponding function.
+
+2.0.3
+-----
+
+ - Added new class :py:class:`pyteomics.mass.Unimod`. The interface is
+   experimental and may change.
+
+ - Improved :py:func:`iterfind` function in XML-reading modules.
+
+ - :py:class:`pyteomics.mass.Composition` objects now support multiplication by
+   :py:class:`int`.
+
+ - Bugfix in :py:func:`auxiliary.linear_regression`.
+
+2.0.2
+-----
+
+ - Added new function :py:func:`iterfind` in :py:mod:`pyteomics.mzid`,
+   :py:mod:`pyteomics.pepxml` and :py:mod:`pyteomics.mzml`.
+
+2.0.1
+-----
+
+API changes
+...........
+
+ - :py:func:`pyteomics.parser.peptide_length` is renamed to
+   :py:func:`pyteomics.parser.length`.
+
+2.0.0
+-----
+
+ - Added :py:mod:`mzid` module for parsing of mzIdentML files.
+
+ - Fixed bugs, improved tests.
+
+API changes
+...........
+
+ - top-module functions in :py:mod:`fasta`, :py:mod:`mgf`, :py:mod:`mzml`,
+   :py:mod:`pepxml`, as well as :py:mod:`mzid`, are now called :py:func:`read`.
+
+ - in :py:mod:`parser`, :py:func:`parse_sequence` renamed to :py:func:`parse`.
+   It now accepts an optional parameter `allow_unknown_modifications`.
+
+ - :py:func:`mgf.write_mgf` and :py:func:`fasta.write_fasta` renamed to
+   :py:func:`write`.
+
+ - the output format of all :py:func:`read` functions has changed.
+
+1.2.5
+-----
+
+ - Include Apache license version 2.0:
+   http://www.opensource.org/licenses/Apache-2.0
+
+ - Minor bugfix in :py:mod:`pyteomics.fasta`.
+
+1.2.4
+-----
+
+ - Changes in :py:mod:`pyteomics.mass`.
+
+API changes
+...........
+
+ - :py:class:`Composition` objects can be created using positional first
+   argument, which will be treated as a sequence or (upon failure) as a formula.
+   This means that all functions relying on Composition
+   (:py:func:`calculate_mass`, :py:func:`most_probable_isotopic_composition`,
+   :py:func:`isotopic_composition_abundance`) allow that as well. However, it's
+   of no use for the latter.
+
+ - :py:class:`Composition` entries for modifications can be added to *aa_comp*
+   and used in composition and mass calculations. This way the specified group
+   will be added to any residue bearing this modification.
+
+ - That being said, the :py:func:`add_modifications` function is not needed
+   anymore and has been removed.
+
+ - Addition and subtraction of :py:class:`Composition` objects now produces a
+   :py:class:`Composition` object, allowing addition/subtraction of multiple
+   objects.
+
+ - :py:class:`Composition` is now a subclass of
+   :py:class:`collections.defaultdict` so one can safely retrieve values
+   without checking if a key exists.
+
+1.2.3
+-----
+
+ - :py:func:`pyteomics.parser.isoforms` now allows terminal modifications.
+
+ - Bugfixes in :py:func:`pyteomics.parser.parse_sequence`.
+
+ - New function :py:func:`pyteomics.parser.tostring` converts parsed sequences
+   to strings.
+
+ - Helper function :py:func:`pyteomics.parser.is_modX` added to check *modX* labels.
+
+API changes
+...........
+
+ - :py:func:`pyteomics.parser.isoforms` now returns a generator object
+
+1.2.2
+-----
+
+ - Bugfix in :py:mod:`pyteomics.pepxml`: modification info is now extracted.
+ - New optional bool argument 'split' in :py:func:`pyteomics.parser.parse_sequence()`
+   allows generating a list of tuples where modifications are separated from the
+   residues instead of a regular list of labels. In *labels* not only *modX* labels
+   are now allowed, but also separate *mod* prefixes. Such modifications are
+   assumed to be applicable to any residue.
+
+
+1.2.1
+-----
+
+ - Memory usage **significantly** decreased when parsing large mzML and pepXML
+   files.
+
+1.2.0
+-----
+
+ - Added support for Python 3. Python 2.7 is still supported, Python 2.6 is not.
+
+1.1.1
+-----
+
+ - New function called :py:func:`add_modifications()` added in
+   :py:mod:`pyteomics.mass`. It updates *aa_comp*.
+ - Also, :py:func:`pyteomics.parser.isoforms` is a new function to get
+   all possible modified sequences of a peptide.
+
+1.1.0
+-----
+
+ - New module added - :py:mod:`pyteomics.mgf`. It is intended for reading and
+   writing files in Mascot Generic Format.
+
+1.0.2
+-----
+
+ - In :py:mod:`pyteomics.pepxml` module, now all search hits are read from file
+   (not only the top hit).
+
+API changes:
+............
+
+ - :py:func:`pyteomics.pepxml.read`: information specific to search hits is now
+   stored in a list under the ``'search_hits'`` key. The list is sorted by hit
+   rank.
+
+
+1.0.1
+-----
+
+ - Fix compatibility issues in :py:mod:`pyteomics.pepxml` module.
+
+1.0.0
+-----
+
+ - The first public release of Pyteomics.
+
+API changes:
+............
+
+ - :py:mod:`pyteomics.achrom`: rename ``'length correction factor'`` to
+   ``'length correction parameter'``.
+
+   - :py:func:`pyteomics.achrom.get_RCs_vary_lcf` was renamed to
+     :py:func:`pyteomics.achrom.get_RCs_vary_lcp`.
+   - `length_correction_factor` keyword argument of
+     :py:func:`pyteomics.achrom.get_RCs` was renamed to `lcp`.
+
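Note on the 3.0.0 CHANGELOG entry above: it says the q-value array "is always sorted (as it should be by definition)". The sketch below illustrates why, using a plain target-decoy estimate. It is not the Pyteomics implementation: the name `qvalues_sketch`, the `(score, is_decoy)` input format, the assumption that lower scores are better, and the simple decoys/targets ratio (without the `correction` or `ratio` parameters mentioned in the CHANGELOG) are all assumptions made for illustration.

```python
def qvalues_sketch(psms, reverse=False):
    """Return (score, q-value) pairs, ordered from best to worst score.

    psms is an iterable of (score, is_decoy) pairs (hypothetical format).
    Lower scores are treated as better unless reverse=True.
    """
    ordered = sorted(psms, key=lambda psm: psm[0], reverse=reverse)

    # FDR estimate at each score threshold: decoys seen / targets seen.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ordered:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(float(decoys) / max(targets, 1))

    # The q-value is the smallest FDR achievable at this threshold or any
    # more permissive one, i.e. a running minimum taken from the worst score.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()

    return [(score, q) for (score, _), q in zip(ordered, qvals)]


psms = [(0.01, False), (0.02, False), (0.03, True), (0.04, False), (0.05, True)]
for score, q in qvalues_sketch(psms):
    print(score, round(q, 3))
```

The raw FDR estimate can drop when a run of targets follows a decoy, so it is not monotonic by itself. Taking the running minimum from the worst score upward is what makes the resulting q-values non-decreasing.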
diff --git a/PKGBUILD b/PKGBUILD
new file mode 100644
index 000000000000..811249bb4941
--- /dev/null
+++ b/PKGBUILD
@@ -0,0 +1,22 @@
+# Maintainer: Lev Levitsky <levlev at mail dot ru>
+pkgname=python2-pyteomics
+pkgver=3.0.1
+pkgrel=1
+pkgdesc="A framework for proteomics data analysis."
+arch=('any')
+url="http://pythonhosted.org/pyteomics"
+license=('Apache')
+depends=('python2')
+optdepends=('python2-matplotlib: for pylab_aux module, optional' \
+            'python2-lxml: for XML parsing modules, recommended' \
+            'python2-numpy: for lots of features, highly recommended')
+options=(!emptydirs)
+source=("https://pypi.python.org/packages/source/p/pyteomics/pyteomics-${pkgver}.tar.gz")
+md5sums=('2c838cc1c16dce69148662b883c755b9')
+changelog="CHANGELOG"
+package() {
+  cd "${srcdir}/pyteomics-${pkgver}"
+  python2 setup.py install --root="$pkgdir/" --optimize=1
+}
+
+# vim:set ts=2 sw=2 et:
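Note on `source` and `md5sums` in the PKGBUILD above: the two arrays pair a downloaded file with the MD5 digest it is expected to have. The sketch below shows that kind of check using only the Python standard library. It is an illustration, not makepkg's own code, and the helper name `md5_matches` and its `chunk_size` parameter are hypothetical.

```python
import hashlib


def md5_matches(path, expected_hex, chunk_size=65536):
    """Return True if the MD5 digest of the file at path equals expected_hex."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

Assuming the tarball has already been downloaded to the working directory, a call could look like `md5_matches("pyteomics-3.0.1.tar.gz", "2c838cc1c16dce69148662b883c755b9")`.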