Checking messages against gettext
=================================

Before accepting a translation message, Launchpad runs it through
gettext to check for certain errors.  For example, if the message is
marked as a C format string in the template, using the c-format flag,
the set of % conversion specifiers in the translation should be
compatible with those of the original English string in the template.
Other languages such as Python have similar format string capabilities
with corresponding flags.

Invalid messages can still make it into the database, however.  The
cause may be a bug in Launchpad, or a gettext update that notices
incompatibilities that earlier versions didn't.
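
To make the idea of "compatible conversion specifiers" concrete, here is
a minimal sketch of such a comparison in plain Python.  It is not
gettext's implementation and is not part of the test: the names
`conversion_types` and `check_c_format`, the simplified regular
expression, and the error strings are all assumptions made for
illustration.  It also ignores positional arguments such as "%1$d",
which real gettext understands.

```python
import re

# Simplified pattern for C conversion specifiers (an assumption, not
# gettext's parser): optional flags, width, precision and length
# modifier, followed by a conversion character.  "%%" is a literal
# percent sign and carries no conversion character.
_C_SPEC = re.compile(
    r"%(?:%|[-+ #0]*(?:\d+|\*)?(?:\.(?:\d+|\*))?(?:hh|h|ll|l|L|j|z|t)?"
    r"([diouxXeEfFgGaAcspn]))")


def conversion_types(text):
    """Return the conversion characters found in `text`, in order."""
    return [m.group(1) for m in _C_SPEC.finditer(text) if m.group(1)]


def check_c_format(original, translation):
    """Return a (hypothetical) error string, or None if compatible."""
    expected = conversion_types(original)
    actual = conversion_types(translation)
    if len(expected) != len(actual):
        return "different number of conversion specifiers"
    if expected != actual:
        return "conversion specifiers differ"
    return None


# The strings used later in this test, checked against the original.
print(check_c_format('%d n', '%d c'))
print(check_c_format('%d n', '%s c'))
print(check_c_format('%d n', '%s %s i'))
```

Under these assumptions the first translation is accepted, the second
is rejected because "%s" does not match "%d", and the third is rejected
because it has two specifiers where the original has one.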


Test setup
----------

Here we use an instrumented version of the script whose clock counts
messages checked instead of measuring real time.  This avoids the
nondeterministic commit points that would otherwise appear in the
output.

    >>> from zope.security.proxy import removeSecurityProxy

    >>> from lp.services.database.sqlbase import quote
    >>> from lp.translations.scripts.gettext_check_messages import (
    ...     GettextCheckMessages)
    >>> from lp.services.log.logger import FakeLogger
    >>> from lp.testing.faketransaction import FakeTransaction

    >>> class InstrumentedGettextCheckMessages(GettextCheckMessages):
    ...     _commit_interval = 3
    ...     def _get_time(self):
    ...         return self._check_count

    >>> def run_checker(options, commit_interval=None):
    ...     """Create and run an instrumented `GettextCheckMessages`."""
    ...     checker = InstrumentedGettextCheckMessages(
    ...         'gettext-check-messages-test', test_args=options)
    ...     checker.logger = FakeLogger()
    ...     checker.txn = FakeTransaction(log_calls=True)
    ...     if commit_interval is not None:
    ...         checker._commit_interval = commit_interval
    ...     checker.main()

    >>> login('foo.bar@canonical.com')

    >>> pofile = factory.makePOFile()
    >>> template = pofile.potemplate

A sample translatable message is flagged as containing a C-style format
string.  This means that conversion specifiers such as "%d" are
significant, so gettext checks those in the translations for
compatibility with those in the original message.

    >>> potmsgset = factory.makePOTMsgSet(
    ...     potemplate=template, singular=u'%d n', sequence=1)
    >>> potmsgset.flagscomment = 'c-format'
    >>> potmsgset.flags
    [u'c-format']

The sample message has an upstream translation, and an Ubuntu
translation in Launchpad that differs from the upstream one.

    >>> ubuntu_message = factory.makeCurrentTranslationMessage(
    ...     pofile=pofile, potmsgset=potmsgset, translator=template.owner,
    ...     reviewer=template.owner, translations=[u'%d c'])
    >>> ubuntu_message = removeSecurityProxy(ubuntu_message)

    >>> upstream_message = factory.makeCurrentTranslationMessage(
    ...     pofile=pofile, potmsgset=potmsgset, translator=template.owner,
    ...     reviewer=template.owner, translations=[u'%d i'])
    >>> upstream_message = removeSecurityProxy(upstream_message)

    >>> upstream_message.is_current_ubuntu = False
    >>> ubuntu_message.is_current_upstream = False
    >>> upstream_message.is_current_upstream = True
    >>> ubuntu_message.is_current_ubuntu = True


Basic operation
---------------

The gettext_check_messages script goes through a given set of messages
and redoes the gettext check.  The messages to check are specified as a
plain SQL WHERE clause.

    >>> run_checker(['-vv', "-w id=%s" % quote(ubuntu_message.id)])
    DEBUG Checking messages matching:  id=...
    DEBUG Checking message ...
    DEBUG Commit point.
    COMMIT
    INFO Done.
    INFO Messages checked: 1
    INFO Validation errors: 0
    INFO Messages disabled: 0
    INFO Commit points: ...


Detecting errors
----------------

If a translation fails to validate against its potmsgset, the script
detects the problem when it checks that message.

    >>> ubuntu_message.is_current_ubuntu
    True

    >>> from lp.services.propertycache import get_property_cache
    >>> get_property_cache(ubuntu_message).translations = [u'%s c']

    >>> run_checker(["-w id=%s" % quote(ubuntu_message.id)])
    DEBUG Checking messages matching:  id=...
    DEBUG Checking message ...
    INFO ... (ubuntu): format specifications ... are not the same
    DEBUG Commit point.
    COMMIT
    DEBUG Commit point.
    COMMIT
    INFO Done.
    INFO Messages checked: 1
    INFO Validation errors: 1
    INFO Messages disabled: 1
    INFO Commit points: ...

The failed message is demoted to a mere suggestion.

    >>> ubuntu_message.is_current_ubuntu
    False


Output
------

Besides Ubuntu messages, the script's output also distinguishes
upstream ones, and ones that are completely unused. The upstream message
happens to produce validation errors.

    >>> get_property_cache(upstream_message).translations = [u'%s %s i']

In this example we'd like to see a nicely predictable ordering, so we
add a sort order using the -o option.

    >>> run_checker(['-w', 'potmsgset=%s' % quote(potmsgset), '-o', 'id'])
    DEBUG Checking messages matching:  potmsgset=...
    DEBUG Checking message ...
    INFO ... (unused): format specifications ... are not the same
    DEBUG Commit point.
    COMMIT
    DEBUG Checking message ...
    INFO ... (upstream): number of format specifications ... does not match...
    DEBUG Commit point.
    COMMIT
    INFO Done.
    INFO Messages checked: 2
    INFO Validation errors: 2
    INFO Messages disabled: 1
    INFO Commit points: 2

The script also notes when a message is shared between upstream and Ubuntu.

    >>> upstream_message.is_current_ubuntu = True
    >>> upstream_message.is_current_upstream = True
    >>> run_checker(["-w id=%s" % quote(upstream_message.id)])
    DEBUG ...
    INFO ... (ubuntu, upstream): number of format specifications ...


Dry runs
--------

The --dry-run option makes the script abort all its database changes.

    >>> ubuntu_message.is_current_ubuntu = True

    >>> run_checker(["-w id=%s" % quote(ubuntu_message.id), '--dry-run'])
    INFO Dry run.  Not making any changes.
    DEBUG Checking messages matching:  id=...
    DEBUG Checking message ...
    INFO ... (ubuntu): format specifications ... are not the same
    DEBUG Commit point.
    ABORT
    DEBUG Commit point.
    ABORT
    INFO Done.
    INFO Messages checked: 1
    INFO Validation errors: 1
    INFO Messages disabled: 1
    INFO Commit points: 2


Commit points
-------------

To avoid long-running transactions and potential locks, the script
commits regularly.  Normally this happens every few seconds.  For the
purpose of this test we count messages checked.  If we set the commit
interval to 1, we get a commit after every message plus one at the end
to close things off neatly.

    >>> run_checker(["-w potmsgset=%s" % quote(potmsgset)], commit_interval=1)
    DEBUG Checking messages matching:  potmsgset=...
    DEBUG Checking message ...
    INFO ... (...): number of format specifications ...
    DEBUG Commit point.
    COMMIT
    DEBUG Checking message ...
    INFO ... (...): format specifications ... are not the same
    DEBUG Commit point.
    COMMIT
    DEBUG Commit point.
    COMMIT
    INFO Done.
    INFO Messages checked: 2
    INFO Validation errors: 2
    INFO Messages disabled: 0
    INFO Commit points: 3
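
The commit pattern described in this section, with a replaceable clock
as in the test setup, can be sketched in isolation as follows.  This is
an illustration only and is not executed as part of the test: the class
`PeriodicCommitter` and the function `check_all` are hypothetical names,
and the real script's internals are not shown in this document, so its
logic may well differ.

```python
class PeriodicCommitter:
    """Commit once `interval` clock units have passed since the last
    commit.  Hypothetical helper, not the script's actual code."""

    def __init__(self, commit, clock, interval):
        self._commit = commit
        self._clock = clock
        self._interval = interval
        self._last_commit = clock()
        self.commit_points = 0

    def _do_commit(self):
        self._commit()
        self.commit_points += 1
        self._last_commit = self._clock()

    def tick(self):
        """Call after each unit of work; commit if the interval is up."""
        if self._clock() - self._last_commit >= self._interval:
            self._do_commit()

    def finish(self):
        """Final commit to close things off."""
        self._do_commit()


def check_all(messages, interval):
    """Pretend to check `messages`, using the check count as the clock."""
    log = []
    checked = [0]
    committer = PeriodicCommitter(
        commit=lambda: log.append("COMMIT"),
        clock=lambda: checked[0],
        interval=interval)
    for message in messages:
        log.append("check %s" % message)
        checked[0] += 1
        committer.tick()
    committer.finish()
    return log, committer.commit_points


log, commit_points = check_all(['a', 'b'], interval=1)
print(log)
print(commit_points)
```

With an interval of 1 and two messages, the sketch commits after each
message and once more at the end, for three commit points, as in the
example above.  Passing the clock in as a callable is what lets a test
substitute a deterministic counter for real time.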