= Code Import Jobs =
A CodeImportJob is a record of a pending or running code import job.
CodeImports are hidden from regular users currently. David Allouche is a
member of the vcs-imports team and can access the objects freely.
>>> login('david.allouche@canonical.com')
CodeImportJob objects can be accessed via a utility registered for the
ICodeImportJobSet interface.
>>> from canonical.launchpad.webapp.testing import verifyObject
>>> from lp.code.interfaces.codeimportjob import ICodeImportJobSet
>>> job_set = getUtility(ICodeImportJobSet)
>>> verifyObject(ICodeImportJobSet, job_set)
True
The code-import-worker scripts are attached to specific job objects and
retrieve jobs by database id using the CodeImportJobSet.getById method.
>>> from lp.code.interfaces.codeimportjob import ICodeImportJob
>>> verifyObject(ICodeImportJob, job_set.getById(1))
True
CodeImportJob objects can also be retrieved using the import_job
property of a CodeImport object. The webapp uses this property to
display the current job of a given CodeImport.
>>> from lp.code.interfaces.codeimport import ICodeImportSet
>>> code_import = getUtility(ICodeImportSet).get(1)
>>> verifyObject(ICodeImportJob, code_import.import_job)
True
The life cycle of a CodeImportJob involves the creation of other objects
at various points. To enforce this, CodeImportJob objects are only
modified using the CodeImportJobWorkflow utility.
>>> from lp.code.interfaces.codeimportjob import ICodeImportJobWorkflow
>>> workflow = getUtility(ICodeImportJobWorkflow)
>>> verifyObject(ICodeImportJobWorkflow, workflow)
True
== Sample data of interest ==
There are two CodeImport objects of interest in the sample data.
>>> from lp.code.interfaces.branchlookup import IBranchLookup
>>> from lp.code.interfaces.codeimport import ICodeImportSet
>>> branch_lookup = getUtility(IBranchLookup)
>>> code_import_set = getUtility(ICodeImportSet)
One has review_status set to NEW.
>>> new_import_branch = branch_lookup.getByUniqueName(
... '~vcs-imports/evolution/import')
>>> new_import = code_import_set.getByBranch(new_import_branch)
>>> print new_import.review_status.name
NEW
The other one has review_status set to REVIEWED.
>>> reviewed_import_branch = branch_lookup.getByUniqueName(
... '~vcs-imports/gnome-terminal/import')
>>> reviewed_import = code_import_set.getByBranch(reviewed_import_branch)
>>> print reviewed_import.review_status.name
REVIEWED
Some workflow methods expect the user that is requesting the action. We
use the No Privileges Person, regardless of what privileges may be
required to initiate the action.
>>> from lp.registry.interfaces.person import IPersonSet
>>> person_set = getUtility(IPersonSet)
>>> nopriv = person_set.getByName('no-priv')
== Test helpers ==
The print_date_attribute function displays a date attribute of an
object. If the value of the attribute is equal to the "UTC_NOW" time
of the current transaction, it prints the string "UTC_NOW" instead of
the actual time value.
>>> from canonical.launchpad.ftests import print_date_attribute
The NewEvents class helps testing the creation of CodeImportEvent
objects.
>>> from lp.code.model.tests.test_codeimportjob import (
... NewEvents)
== Testing whether a job is overdue ==
CodeImportJob objects have a date_due attribute that specifies when the
job should ideally be started. If the date_due is in the past, the job
is said to be overdue, and will be run as soon as possible.
The CodeImportJob.isOverdue() method tells whether a job is overdue.
>>> from datetime import datetime
>>> from pytz import UTC
>>> import_job = reviewed_import.import_job
>>> from zope.security.proxy import removeSecurityProxy
>>> def set_date_due(import_job, date):
...     # ICodeImportJob does not allow setting date_due, so we must use
...     # removeSecurityProxy to set it.
...     removeSecurityProxy(import_job).date_due = date
If date_due is in the future, then the job is not overdue.
>>> future_date = datetime(2100, 1, 1, tzinfo=UTC)
>>> set_date_due(import_job, future_date)
>>> import_job.isOverdue()
False
If date_due is in the past, then the job is overdue.
>>> past_date = datetime(1900, 1, 1, tzinfo=UTC)
>>> set_date_due(import_job, past_date)
>>> import_job.isOverdue()
True
Owing to the fleeting nature of time, if date_due is the time of the
current transaction, then the job is overdue.
>>> from canonical.database.constants import UTC_NOW
>>> set_date_due(import_job, UTC_NOW)
>>> import_job.isOverdue()
True
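The three cases above amount to an inclusive comparison against the
current time. The following is a minimal standalone Python 3 sketch of
that rule, not Launchpad's implementation: the is_overdue function and
its explicit 'now' argument are hypothetical, standing in for the
current transaction time (UTC_NOW) that the doctest relies on.

```python
from datetime import datetime, timezone


def is_overdue(date_due, now):
    """Return True if date_due is at or before now (hypothetical helper)."""
    # A job due exactly "now" counts as overdue, matching the UTC_NOW case.
    return date_due <= now


# 'now' is an arbitrary fixed instant chosen for the illustration.
now = datetime(2010, 1, 1, tzinfo=timezone.utc)
future_date = datetime(2100, 1, 1, tzinfo=timezone.utc)
past_date = datetime(1900, 1, 1, tzinfo=timezone.utc)

# One result per case, in the order the cases appear above.
results = [is_overdue(d, now) for d in (future_date, past_date, now)]
```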
== Creating a new job ==
CodeImportJob objects are created using the CodeImportJobWorkflow.newJob
method.
In normal use, the only case where a job object is created explicitly is
when the review status of a code import is modified. This case is
handled by the CodeImport.updateFromData method.
When the review status of an import changes to REVIEWED, an associated
job is created.
>>> from lp.code.enums import CodeImportReviewStatus
>>> unproxied_new_import = removeSecurityProxy(new_import)
>>> unproxied_new_import.review_status = CodeImportReviewStatus.REVIEWED
>>> new_job = workflow.newJob(new_import)
>>> print new_import.import_job
<security proxied ...CodeImportJob instance at 0x...>
Jobs are always created in PENDING state.
>>> print new_job.state.name
PENDING
If the associated code import has never been run, its date due is set to
UTC_NOW, so it will be run as soon as possible.
>>> print_date_attribute(new_job, 'date_due')
UTC_NOW
When the code import is associated with existing CodeImportResult
objects, the date due may be UTC_NOW or a timestamp in the future. This
is covered in detail in the test_codeimportjob.py file.
== Deleting a pending job ==
In normal use, the only case where a job object is deleted explicitly is
when the review status of a code import is modified. This case is
handled by the CodeImport.updateFromData method.
When the review status of an import changes from REVIEWED, and the
associated job is not running, the job is deleted.
>>> unproxied_new_import.review_status = CodeImportReviewStatus.INVALID
>>> workflow.deletePendingJob(new_import)
>>> print new_import.import_job
None
== Requesting a job run ==
When a job is pending, users can request that it be run as soon as
possible.
>>> from datetime import datetime
>>> from pytz import UTC
>>> pending_job = reviewed_import.import_job
>>> future_date = datetime(2100, 1, 1, tzinfo=UTC)
>>> # ICodeImportJob does not expose date_due,
>>> # so we must use removeSecurityProxy.
>>> removeSecurityProxy(pending_job).date_due = future_date
>>> new_events = NewEvents()
>>> workflow.requestJob(pending_job, nopriv)
This records the requesting user in the job object and sets its date due
for running as soon as possible.
>>> print pending_job.requesting_user.name
no-priv
>>> print_date_attribute(pending_job, 'date_due')
UTC_NOW
The job request is also recorded in the CodeImportEvent audit trail.
>>> print new_events.summary()
REQUEST ~vcs-imports/gnome-terminal/import no-priv
Once a job has been requested by a user, it cannot be requested a
second time until the job runs and terminates. Any Launchpad web
application code that calls requestJob must therefore first check
whether the job has already been requested by another user and, if so,
present a message explaining that this has happened.
>>> workflow.requestJob(pending_job, nopriv)
Traceback (most recent call last):
...
AssertionError: The CodeImportJob associated with
~vcs-imports/gnome-terminal/import was already requested by no-priv.
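The check that calling code is expected to make can be sketched as
follows. This is a standalone Python 3 illustration under stated
assumptions, not the webapp's actual code: FakeJob, request_or_explain
and fake_request_job are hypothetical names, and the only attribute
taken from the text above is requesting_user.

```python
class FakeJob(object):
    """Hypothetical stand-in exposing only requesting_user."""

    def __init__(self, requesting_user=None):
        self.requesting_user = requesting_user


def request_or_explain(job, user, request_job):
    """Call request_job(job, user) unless the job was already requested.

    Returns None on success, or a message to show to the user otherwise.
    """
    if job.requesting_user is not None:
        # Avoid triggering the AssertionError by checking first.
        return ("This import was already requested by %s."
                % job.requesting_user)
    request_job(job, user)
    return None


def fake_request_job(job, user):
    # Stand-in for workflow.requestJob: only records the requester.
    job.requesting_user = user


job = FakeJob()
first = request_or_explain(job, "no-priv", fake_request_job)
second = request_or_explain(job, "no-priv", fake_request_job)
```

The first call goes through and records the requester. The second call
returns a message instead of reaching the workflow method.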
== Starting a job ==
When a job is about to be performed by a code import worker, the
startJob workflow method updates the job's fields to indicate that it is
now running and which machine it is running on.
>>> from lp.code.interfaces.codeimportmachine import ICodeImportMachineSet
>>> machine_set = getUtility(ICodeImportMachineSet)
>>> machine = machine_set.getByHostname('bazaar-importer')
>>> new_events = NewEvents()
This method updates the date_started, heartbeat, logtail, machine and
state fields of the job. Before the call, date_started, heartbeat,
logtail and machine are NULL and the state is PENDING.
>>> print_date_attribute(pending_job, 'date_started')
None
>>> print_date_attribute(pending_job, 'heartbeat')
None
>>> print pending_job.logtail
None
>>> print pending_job.machine
None
>>> print pending_job.state.name
PENDING
After the call, the date_started and heartbeat fields are both updated
to the current time, the logtail is the empty string, machine is set
to the supplied import machine and the state is RUNNING.
>>> workflow.startJob(pending_job, machine)
>>> print_date_attribute(pending_job, 'date_started')
UTC_NOW
>>> print_date_attribute(pending_job, 'heartbeat')
UTC_NOW
>>> pending_job.logtail
u''
>>> print pending_job.machine.hostname
bazaar-importer
>>> print pending_job.state.name
RUNNING
>>> running_job = pending_job
The event is also recorded in the CodeImportEvent audit trail.
>>> print new_events.summary()
START ~vcs-imports/gnome-terminal/import bazaar-importer
== Recording progress on a job ==
As the code import worker progresses, it calls the updateHeartbeat
method at least every minute to indicate that it is still progressing.
This makes it possible to detect situations such as a machine falling
off the network, or becoming starved of RAM and thrashing badly.
As updateHeartbeat updates the 'heartbeat' field of the job to the
current transaction time, we force a date in the past into this
field now so that we can check that updateHeartbeat has an effect.
>>> removeSecurityProxy(running_job).heartbeat = \
... datetime(2007, 1, 1, 0, 0, 0, tzinfo=UTC)
>>> from canonical.launchpad.ftests import sync
>>> sync(running_job)
>>> new_events = NewEvents()
As stated above, updateHeartbeat updates the 'heartbeat' field to the
current transaction time. It also takes a 'logtail' parameter, which is
intended to be displayed in the web UI to give the operators some idea
of what the import worker is currently doing for this job.
>>> print_date_attribute(running_job, 'heartbeat')
2007-01-01 00:00:00+00:00
>>> running_job.logtail
u''
>>> workflow.updateHeartbeat(running_job, u'some interesting log output')
>>> print_date_attribute(running_job, 'heartbeat')
UTC_NOW
>>> running_job.logtail
u'some interesting log output'
No code import events are generated by this method.
>>> new_events.summary()
''
== Finishing a job ==
When a job finishes, the code import worker records this fact by
calling the finishJob workflow method, which is responsible for all of
the housekeeping associated with the end of an attempt to update a
code import, successful or not:
- creating a CodeImportResult record for the job run,
- deleting the row in the database for the now finished run and
creating a new one for the next run, and
- logging a FINISH CodeImportEvent.
The method takes a running job, a status code indicating whether the
job completed successfully or not and an optional link to the log of
the import run in the librarian.
Also, in the successful case, finishJob calls requestMirror() on the
import branch so that the newly imported revisions can be pulled into
the code hosting area.
In this example, the import branch has never been marked as needing
mirroring, so the 'next_mirror_time' field is empty:
>>> print_date_attribute(code_import.branch, 'next_mirror_time')
None
We just document the successful case here, when a log is not recorded.
The details are tested in unit tests in
../database/tests/test_codeimportjob.py.
>>> new_events = NewEvents()
>>> finished_job_id = running_job.id
>>> finished_date_due = running_job.date_due
>>> from lp.code.enums import CodeImportResultStatus
>>> workflow.finishJob(
... running_job, CodeImportResultStatus.SUCCESS, None)
The passed in job is now deleted.
>>> print job_set.getById(finished_job_id)
None
And a new one has been created, scheduled appropriately far in the
future.
>>> code_import.import_job.id != finished_job_id
True
>>> code_import.effective_update_interval
datetime.timedelta(0, 21600)
>>> code_import.import_job.date_due - finished_date_due
datetime.timedelta(0, 21600)
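The interval printed above, timedelta(0, 21600), is zero days and 21600
seconds, i.e. six hours. The small standalone Python 3 sketch below
only illustrates the arithmetic checked in this example. It is not the
scheduling rule itself, which is covered by the unit tests, and the
finished_date_due value is made up.

```python
from datetime import datetime, timedelta, timezone

# timedelta takes (days, seconds), matching the repr shown above.
interval = timedelta(0, 21600)

# Hypothetical due date of the job that just finished.
finished_date_due = datetime(2010, 1, 1, 12, 0, tzinfo=timezone.utc)

# In this example the new job is due exactly one interval later.
next_date_due = finished_date_due + interval
```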
A CodeImportResult object has now been created to record the result of
this run, containing details such as the import worked on, the dates
the import started and finished and the final status of the run.
>>> results = list(code_import.results)
>>> len(results)
1
>>> [result] = results
>>> result.code_import.id
1
>>> print_date_attribute(result, 'date_job_started')
UTC_NOW
>>> # The python-level 'date_job_finished' field is punned with the
>>> # date_created database column.
>>> print_date_attribute(result, 'date_created')
UTC_NOW
>>> print result.status.name
SUCCESS
And because we're pretending that this was a successful run, the
branch is now due to be mirrored by the branch puller:
>>> print_date_attribute(code_import.branch, 'next_mirror_time')
UTC_NOW
Other details of the result object are checked in the unit tests.
Finally, the finishJob() method created a FINISH CodeImportEvent.
>>> print new_events.summary()
FINISH ~vcs-imports/gnome-terminal/import bazaar-importer
== Reclaiming a job that appears to be stuck ==
The code import worker is meant to update the heartbeat field of its
CodeImportJob row frequently. The code import watchdog periodically
checks the heartbeats of the running jobs. If it finds that a heartbeat
was not updated recently enough, it assumes the job has become stuck
somehow and 'reclaims' it: it removes the job from the database and
creates a pending job for the same import that is due immediately. This
reclaiming is done by the 'reclaimJob' code import job workflow method.
It just takes a running code import job as a parameter.
>>> from canonical.launchpad.testing.codeimporthelpers import (
... make_running_import)
>>> running_import = make_running_import(factory=factory)
>>> running_import_job = running_import.import_job
'reclaimJob' does four separate things:
>>> running_import_job_id = running_import_job.id
>>> new_events = NewEvents()
>>> workflow.reclaimJob(running_import_job)
1) deletes the passed in job,
>>> print job_set.getById(running_import_job_id)
None
2) creates a CodeImportResult with a status of 'RECLAIMED',
>>> results = list(running_import.results)
>>> len(results)
1
>>> [result] = results
>>> result.status.name
'RECLAIMED'
3) creates a new, already due, job for the code import, and
>>> print_date_attribute(running_import.import_job, 'date_due')
UTC_NOW
4) logs a 'RECLAIM' CodeImportEvent.
>>> print new_events.summary()
RECLAIM ...
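The watchdog's staleness check described at the start of this section
can be sketched in standalone Python 3 as follows. This is only an
illustration: the find_stuck_jobs function, the dictionary-based job
records and the 20-minute threshold are all assumptions, not
Launchpad's actual watchdog code or configuration.

```python
from datetime import datetime, timedelta, timezone


def find_stuck_jobs(jobs, now, max_age):
    """Return ids of running jobs whose heartbeat is older than max_age."""
    return [
        job["id"] for job in jobs
        # Only running jobs have a heartbeat worth checking.
        if job["state"] == "RUNNING" and now - job["heartbeat"] > max_age
    ]


now = datetime(2010, 1, 1, 12, 0, tzinfo=timezone.utc)
jobs = [
    {"id": 1, "state": "RUNNING", "heartbeat": now - timedelta(minutes=1)},
    {"id": 2, "state": "RUNNING", "heartbeat": now - timedelta(hours=2)},
    {"id": 3, "state": "PENDING", "heartbeat": None},
]

# The threshold is a made-up value for illustration.
stuck = find_stuck_jobs(jobs, now, max_age=timedelta(minutes=20))
```

Each job identified this way would then be passed to reclaimJob, which
performs the four steps listed above.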