Merge pull request #118 from goodmami/v0.7.0
V0.7.0
goodmami authored Jun 9, 2021
2 parents 44d01d4 + 9ed1502 commit f587a33
Showing 29 changed files with 1,966 additions and 175 deletions.
44 changes: 44 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,42 @@

## [Unreleased]

## [v0.7.0]

**Release date: 2021-06-09**

### Added

* Support for approximate word searches; on by default, configurable
  only by instantiating a `wn.Wordnet` object ([#105])
* `wn.morphy` ([#19])
* `wn.Wordnet.lemmatizer` attribute ([#8])
* `wn.web` ([#116])
* `wn.Sense.relations()` ([#82])
* `wn.Synset.relations()` ([#82])
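
The approximate word searches added in [#105] depend on a normalizer, a function that maps a word form to a simplified form before lookup. The following is a minimal sketch of what such a function might look like, assuming normalization means removing diacritics and case-folding. The name `simple_normalizer` is hypothetical and this is not necessarily what Wn does by default.

```python
import unicodedata


def simple_normalizer(form: str) -> str:
    """Hypothetical normalizer: strip combining marks, then case-fold."""
    # NFKD splits accented characters into a base character plus combining marks
    decomposed = unicodedata.normalize("NFKD", form)
    # Drop the combining marks and keep the base characters
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()
```

Assuming the `normalizer` parameter accepts a string-to-string callable, a function like this could be passed when instantiating a `wn.Wordnet` object.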

### Changed

* `wn.lmf.load()` now takes a `progress_handler` parameter ([#46])
* `wn.lmf.scan_lexicons()` no longer returns sets of relation types or
  lexfiles; `wn.add()` now gets these from loaded lexicons instead
* `wn.util.ProgressHandler`
  - Now has a `refresh_interval` parameter; updates only trigger a
    refresh after the counter hits the threshold set by the interval
  - The `update()` method now takes a `force` parameter to trigger a
    refresh regardless of the refresh interval
* `wn.Wordnet`
  - Initialization now takes a `normalizer` parameter ([#105])
  - Initialization now takes a `lemmatizer` parameter ([#8])
  - Initialization now takes a `search_all_forms` parameter ([#115])
  - `Wordnet.words()`, `Wordnet.senses()` and `Wordnet.synsets()` now
    use any specified lemmatization or normalization functions to
    expand queries on word forms ([#105])
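
The `refresh_interval` and `force` behavior described for `wn.util.ProgressHandler` can be pictured with a toy class. This is only a sketch of the idea under the assumption that a refresh fires once enough updates have accumulated; `CountingProgress` and its attributes are hypothetical and do not reproduce the real class.

```python
class CountingProgress:
    """Toy progress tracker; a hypothetical stand-in, not wn.util.ProgressHandler."""

    def __init__(self, refresh_interval: int = 10) -> None:
        self.refresh_interval = refresh_interval
        self.count = 0      # total units of work reported so far
        self._pending = 0   # units reported since the last refresh
        self.refreshes = 0  # how many times the display was redrawn

    def update(self, n: int = 1, force: bool = False) -> None:
        self.count += n
        self._pending += n
        # Redraw only when enough updates have accumulated, or when forced
        if force or self._pending >= self.refresh_interval:
            self._pending = 0
            self.refreshes += 1


progress = CountingProgress(refresh_interval=10)
for _ in range(25):
    progress.update()
progress.update(0, force=True)  # final refresh regardless of the interval
```

Batching refreshes this way keeps display updates from dominating the cost of a long load, while `force` guarantees that the final state is shown.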

### Fixed

* `wn.Synset.ili` for proposed ILIs now works again ([#117])


## [v0.6.2]

@@ -311,6 +347,7 @@ the https://github.com/nltk/wordnet/ code which had been effectively
abandoned, but this is an entirely new codebase.


[v0.7.0]: ../../releases/tag/v0.7.0
[v0.6.2]: ../../releases/tag/v0.6.2
[v0.6.1]: ../../releases/tag/v0.6.1
[v0.6.0]: ../../releases/tag/v0.6.0
@@ -325,9 +362,12 @@ abandoned, but this is an entirely new codebase.
[unreleased]: ../../tree/main

[#7]: https://github.com/goodmami/wn/issues/7
[#8]: https://github.com/goodmami/wn/issues/8
[#15]: https://github.com/goodmami/wn/issues/15
[#17]: https://github.com/goodmami/wn/issues/17
[#19]: https://github.com/goodmami/wn/issues/19
[#23]: https://github.com/goodmami/wn/issues/23
[#46]: https://github.com/goodmami/wn/issues/46
[#47]: https://github.com/goodmami/wn/issues/47
[#58]: https://github.com/goodmami/wn/issues/58
[#59]: https://github.com/goodmami/wn/issues/59
@@ -348,6 +388,7 @@ abandoned, but this is an entirely new codebase.
[#78]: https://github.com/goodmami/wn/issues/78
[#79]: https://github.com/goodmami/wn/issues/79
[#81]: https://github.com/goodmami/wn/issues/81
[#82]: https://github.com/goodmami/wn/issues/82
[#83]: https://github.com/goodmami/wn/issues/83
[#86]: https://github.com/goodmami/wn/issues/86
[#87]: https://github.com/goodmami/wn/issues/87
@@ -362,3 +403,6 @@ abandoned, but this is an entirely new codebase.
[#105]: https://github.com/goodmami/wn/issues/105
[#106]: https://github.com/goodmami/wn/issues/106
[#108]: https://github.com/goodmami/wn/issues/108
[#115]: https://github.com/goodmami/wn/issues/115
[#116]: https://github.com/goodmami/wn/issues/116
[#117]: https://github.com/goodmami/wn/issues/117
2 changes: 2 additions & 0 deletions docs/api/wn.constants.rst
@@ -140,6 +140,8 @@ Sense Relations
- ``other``


.. _parts-of-speech:

Parts of Speech
---------------

106 changes: 106 additions & 0 deletions docs/api/wn.morphy.rst
@@ -0,0 +1,106 @@

wn.morphy
=========

.. automodule:: wn.morphy

.. seealso::

   The Princeton WordNet `documentation
   <https://wordnet.princeton.edu/documentation/morphy7wn>`_ describes
   the original implementation of Morphy.

   The :doc:`../guides/lemmatization` guide describes how Wn handles
   lemmatization in general.


Initialized and Uninitialized Morphy
------------------------------------

There are two ways of using Morphy in Wn: initialized and
uninitialized.

Uninitialized Morphy is a simple callable that returns lemma
*candidates* for a given wordform. The results might not all be
valid lemmas, but this is not a problem in practice because
subsequent queries against the database filter out the invalid
ones. This callable is obtained by creating a :class:`Morphy` object
with no arguments:

>>> from wn import morphy
>>> m = morphy.Morphy()

Because an uninitialized Morphy cannot tell which lemmas in the result
are valid, it always returns the original form along with any
transformations it can find for each part of speech:

>>> m('lemmata', pos='n') # exceptional form
{'n': {'lemmata'}}
>>> m('lemmas', pos='n') # regular morphology with part-of-speech
{'n': {'lemma', 'lemmas'}}
>>> m('lemmas') # regular morphology for any part-of-speech
{None: {'lemmas'}, 'n': {'lemma'}, 'v': {'lemma'}}
>>> m('wolves') # invalid forms may be returned
{None: {'wolves'}, 'n': {'wolf', 'wolve'}, 'v': {'wolve', 'wolv'}}
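
The candidate generation shown above can be pictured as suffix
detachment. The sketch below is a simplified illustration, not Wn's
code: the ``RULES`` table is a small subset of the detachment rules
listed in the Princeton Morphy documentation, and the names ``RULES``
and ``candidates`` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Illustrative subset of suffix-detachment rules: (suffix, replacement)
RULES: Dict[str, List[Tuple[str, str]]] = {
    "n": [("s", ""), ("ses", "s"), ("ies", "y")],
    "v": [("s", ""), ("ies", "y"), ("es", "e"), ("es", ""), ("ed", ""), ("ing", "")],
}


def candidates(form: str, pos: Optional[str] = None) -> Dict[Optional[str], Set[str]]:
    """Return unfiltered lemma candidates for *form*, grouped by part of speech."""
    # The original form is always kept, under the requested part of speech
    result: Dict[Optional[str], Set[str]] = {pos: {form}}
    for p in ([pos] if pos else list(RULES)):
        for suffix, repl in RULES.get(p, []):
            # Apply a rule only if it leaves a non-empty stem
            if form.endswith(suffix) and len(form) > len(suffix):
                result.setdefault(p, set()).add(form[: -len(suffix)] + repl)
    return result
```

No rule checks whether its output is a real word, which is why
candidates such as ``flie`` for ``flies`` can appear next to ``fly``.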


This lemmatizer can also be used with a :class:`wn.Wordnet` object to
expand queries:

>>> import wn
>>> ewn = wn.Wordnet('ewn:2020')
>>> ewn.words('lemmas')
[]
>>> ewn = wn.Wordnet('ewn:2020', lemmatizer=morphy.Morphy())
>>> ewn.words('lemmas')
[Word('ewn-lemma-n')]
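
Query expansion of this kind can be sketched with a toy lexicon. The
example assumes that a lemmatizer is a callable returning a mapping
from part of speech to candidate forms, as in the outputs above;
``ToyWordnet``, ``strip_s``, and the word ids are hypothetical, and
part of speech is ignored during lookup to keep the sketch short.

```python
from typing import Callable, Dict, List, Optional, Set

Lemmatizer = Callable[[str, Optional[str]], Dict[Optional[str], Set[str]]]


class ToyWordnet:
    """Hypothetical miniature lexicon; not the wn.Wordnet implementation."""

    def __init__(self, index: Dict[str, List[str]],
                 lemmatizer: Optional[Lemmatizer] = None) -> None:
        self.index = index            # lemma -> word ids
        self.lemmatizer = lemmatizer  # None means queries match forms exactly

    def words(self, form: str, pos: Optional[str] = None) -> List[str]:
        forms = {form}
        if self.lemmatizer is not None:
            # Expand the query with every candidate, valid or not
            for cands in self.lemmatizer(form, pos).values():
                forms |= cands
        # Candidates missing from the index simply match nothing
        return sorted(wid for f in forms for wid in self.index.get(f, []))


def strip_s(form: str, pos: Optional[str] = None) -> Dict[Optional[str], Set[str]]:
    """Crude stand-in lemmatizer: offer the form with a final 's' removed."""
    cands = {form, form[:-1]} if form.endswith("s") and len(form) > 1 else {form}
    return {pos: cands}


index = {"lemma": ["toy-lemma-n"], "wolf": ["toy-wolf-n"]}
plain = ToyWordnet(index)
expanded = ToyWordnet(index, lemmatizer=strip_s)
```

Here ``plain.words('lemmas')`` finds nothing while
``expanded.words('lemmas')`` finds the entry for ``lemma``; the bogus
candidate ``wolve`` produced for ``wolves`` is harmless because it
matches no entry.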

An initialized Morphy is created with a :class:`wn.Wordnet` object as
its argument. It then uses the wordnet to build lists of valid lemmas
and exceptional forms (this takes a few seconds). Once this is done,
it will only return lemmas it knows about:

>>> ewn = wn.Wordnet('ewn:2020')
>>> m = morphy.Morphy(ewn)
>>> m('lemmata', pos='n') # exceptional form
{'n': {'lemma'}}
>>> m('lemmas', pos='n') # regular morphology with part-of-speech
{'n': {'lemma'}}
>>> m('lemmas') # regular morphology for any part-of-speech
{'n': {'lemma'}}
>>> m('wolves') # invalid forms are pre-filtered
{'n': {'wolf'}}
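
The pre-filtering step can be sketched as a lookup against sets of
known lemmas and exceptional forms. This is an illustration under
assumptions, not Wn's implementation: ``KNOWN_LEMMAS``,
``EXCEPTIONS``, and ``filter_candidates`` are hypothetical names, and
the toy data stands in for what would be read from the wordnet.

```python
from typing import Dict, Optional, Set

# Toy data standing in for what an initialized lemmatizer reads from a wordnet
KNOWN_LEMMAS: Dict[str, Set[str]] = {"n": {"lemma", "wolf"}, "v": set()}
EXCEPTIONS: Dict[str, Dict[str, Set[str]]] = {"n": {"lemmata": {"lemma"}}}


def filter_candidates(raw: Dict[Optional[str], Set[str]]) -> Dict[str, Set[str]]:
    """Keep candidates that are known lemmas or map to one via an exception."""
    filtered: Dict[str, Set[str]] = {}
    for pos, forms in raw.items():
        # Forms listed under None have no part of speech; test them against all
        for p in ([pos] if pos is not None else list(KNOWN_LEMMAS)):
            valid: Set[str] = set()
            for form in forms:
                if form in KNOWN_LEMMAS.get(p, set()):
                    valid.add(form)
                # Irregular forms are resolved through the exception list
                valid |= EXCEPTIONS.get(p, {}).get(form, set())
            if valid:
                filtered.setdefault(p, set()).update(valid)
    return filtered
```

Feeding it the unfiltered candidates for ``wolves`` leaves only
``wolf`` under ``'n'``, and parts of speech with no valid lemma are
omitted from the result.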

An initialized Morphy lemmatizer needs an existing
:class:`wn.Wordnet` object, so to use it with that same object it
must be assigned to the object after creation:

>>> ewn = wn.Wordnet('ewn:2020') # default: lemmatizer=None
>>> ewn.words('lemmas')
[]
>>> ewn.lemmatizer = morphy.Morphy(ewn)
>>> ewn.words('lemmas')
[Word('ewn-lemma-n')]

A :class:`wn.Wordnet` object returns nearly identical results whether
it uses an initialized or an uninitialized :class:`Morphy` object,
but the two may have slightly different performance profiles for
subsequent queries.


Default Morphy Lemmatizer
-------------------------

As a convenience, an uninitialized Morphy lemmatizer is provided in
this module via the :data:`morphy` member.

.. data:: morphy

   A :class:`Morphy` object created without a :class:`wn.Wordnet`
   object.


The Morphy Class
----------------

.. autoclass:: Morphy
2 changes: 1 addition & 1 deletion docs/api/wn.rst
@@ -286,7 +286,7 @@ Wn's data storage and retrieval can be configured through the

.. seealso::

-   :doc:`../guides/setup` describes how to configure Wn using the
+   :doc:`../setup` describes how to configure Wn using the
    :data:`wn.config` instance.

.. autodata:: config