TMC APPLICATIONS OF ARCHIVED DATA
OPERATIONAL TEST - ADMS VIRGINIA
BUILD 1 – DRAFT FUNCTIONAL REQUIREMENTS
Evaluation Team Review Summary
April 9, 2003
Prepared for:
US Department of Transportation
ITS Joint Program Office, HVH-1

Submitted by:
SAIC Evaluation Team
SAIC, CSI, TRAC
Table of Contents
1. Introduction
2. Scope and Organization of Review Comments
3. Evaluation Points
   3.1 Likely Evaluation Focus
4. Technical Comments and Questions
5. Miscellaneous Comments and Suggestions
1. Introduction
This technical memorandum documents the results of the Science Applications
International Corporation (SAIC) Evaluation Team's review of the draft Build 1
Functional Requirements document prepared by the ADMS Virginia project team.
This review has been prepared to support the FHWA-sponsored evaluation of the
TMC Applications of Archived Data Operational Test (hereafter known as the
ADMS Virginia Project) under contract DTFH61-02-C-00061.
This document represents a compilation of comments from the SAIC Evaluation
Team, which consists of staff from SAIC, Cambridge Systematics (CSI), and the
Washington State Transportation Center (TRAC). Where the same observation
appears more than once, the duplication serves to emphasize comments raised by
multiple members of the Evaluation Team.
2. Scope and Organization of Review Comments
Comments on the draft functional requirements document for Build 1 focused on
technical issues affecting the system as well as potential key evaluation points. Issues
regarding presentation such as grammar were not addressed.
Comments are presented in three main sections: Evaluation Points, Technical
Comments and Questions, and Miscellaneous Comments and Suggestions.
3. Evaluation Points
In the list of expected queries (Section 3.1.2.1), it was noted that the system
does not provide any way to compare "today's" conditions against "normal"
conditions. Yet, from an operational perspective, this is what practitioners
will want to know. For example, is today especially bad or good? (In which
case I, as an operator, might want to take some action.) This implies that
users need a function that computes "normal" and another function that allows
them to compare a given time period ("today" or some other time period of
interest) against "normal."
1) If the goal is operational support, this seems to be the type of analysis that DOTs will
be interested in having. A big question is whether this system will have the timeliness
needed by the operations folks. Being able to study what happened yesterday is good,
but not as good as being able to study "what's happening today?" Thus a good
evaluation point for the Evaluation Team may be: is this system helping the
operators manage their system better in real time, or only as an "after the
fact" analytical tool?
2) How flexible is the system? Does it feed other analytical procedures? The
current design seems to BE the analytical process, rather than a combination of
being an analytical process and doing the data aggregation/reporting work
needed to feed data into a wide variety of other analytical procedures. (Maybe
this is covered under "Outputs" in Section 3.1.2.8, but users really need both
plots and numbers at the same time.)
3) Is the system scalable? How easily does the system expand or move to new areas of
the state? (This also fits under the buzz word "configuration management.") The current
system may have configuration management issues if they can't add stations easily and
then incorporate those new stations into their reporting systems.
4) Are they making use of all of the data being collected? Are they getting an accurate
picture of freeway performance, given limitations in loop performance? (What happens
when some loops at a station report correctly, and others do not?)
5) Section 3.2.1.4: Again, the users really need both numbers and graphics,
usually (but not always) together.
6) Section 3.2.2.2.4.1.15: Are they building an analytical system that allows them to
perform "SELECT IF" analyses? This is a good thing, I'm just a little concerned that they
will be asking for more complexity than they realize.
7) Section 3.2.2.4: Can the incident data be used to select corridor flow and performance
data? How directly are the incident and roadway performance tables linked? (Can I ask
the system to select all AM peak period data for a given corridor for all days that have
incidents occurring during the AM peak period?) The better linked the two tables, the
more useful they are analytically. I get the impression the two data sets are almost totally
independent of each other, which means you need to manually perform any linkage
between the two datasets.
8) Do the users really understand what they are asking for, and what they are
getting in terms of data/results? The Project Team needs to tell users that
when the user asks for "corridor volume" or "corridor segment volume" they are
going to get "the average volume across all stations within the defined
segment." That may or may not be what the user wants. (I might be more inclined
to want the maximum volume or the VMT, which would then equate to a volume
weighted by roadway segment distance.) Their approach is not "wrong"; it is
just one of many ways of doing this, and the user needs to be told up front
which way is being used. I did NOT see that definition in the earlier portions
of the document.
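To see why the definition matters, the arithmetic below compares the "average
volume across all stations" definition with two of the alternatives mentioned
(maximum volume, and a VMT-style volume weighted by segment distance). The
station volumes, segment lengths, and field names are made up for the example
and do not come from the ADMS document.

    # Three ways a "corridor segment volume" request could be answered.
    stations = [
        {"volume": 4200, "length_mi": 1.2},
        {"volume": 5600, "length_mi": 0.8},
        {"volume": 3900, "length_mi": 2.0},
    ]

    simple_average = sum(s["volume"] for s in stations) / len(stations)   # 4566.7 veh/hr
    maximum_volume = max(s["volume"] for s in stations)                   # 5600 veh/hr
    vmt = sum(s["volume"] * s["length_mi"] for s in stations)             # 17320 veh-mi/hr
    length_weighted = vmt / sum(s["length_mi"] for s in stations)         # 4330 veh/hr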
10) Regarding Appendix C, the Evaluation Team had several major reactions:

Data imputation. They are doing a straight average of available data. This is
fine if volumes are constant, but volumes are not constant. As a first pass,
this approach is fine; however, as time and money permit, a more sophisticated
procedure might be more effective. Will the system be designed in a modular way
to allow a more robust interpolation system to replace the initial one? (Note
that the same question applies to "across lanes" interpolation as well as
temporal interpolation, but this document does not discuss the "across lanes"
aspects of missing data at all.) If a change in missing value interpolation
occurs, the "interpolated" data will be different after the change than they
were before it. Will old interpolated data be revised to "better" estimates?
If so, how will this be noted in the database? How will users be informed of
these changes? (A rough sketch of what a replaceable imputation step could look
like follows these comments.)
How are speeds computed? Are these dual-loop sites, or are speeds being
computed from a single loop using a "G" factor, and if so, which factor? (This
needs to be explained to the user in some fashion.)
What happens in a corridor or corridor segment request, when an entire station
has bad data?
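To make the modularity question above concrete, the sketch below shows a
straight-average rule alongside a simple time-aware alternative, written so
that one could be swapped for the other without touching the rest of the
system. The function names and the use of None for missing minutes are
illustrative assumptions, not part of the ADMS design.

    from typing import Callable, Optional, Sequence

    ImputeRule = Callable[[Sequence[Optional[float]]], list]

    def straight_average(values):
        """Initial rule: fill every gap with the mean of the data that did arrive."""
        good = [v for v in values if v is not None]
        fill = sum(good) / len(good) if good else None
        return [v if v is not None else fill for v in values]

    def linear_in_time(values):
        """A more time-aware rule: interpolate between the nearest good neighbors."""
        out = list(values)
        for i, v in enumerate(out):
            if v is not None:
                continue
            left = next((j for j in range(i - 1, -1, -1) if out[j] is not None), None)
            right = next((j for j in range(i + 1, len(values)) if values[j] is not None), None)
            if left is not None and right is not None:
                frac = (i - left) / (right - left)
                out[i] = out[left] + frac * (values[right] - out[left])
            elif left is not None:
                out[i] = out[left]
            elif right is not None:
                out[i] = values[right]
        return out

    def impute_minute_volumes(values, rule: ImputeRule = straight_average):
        """Keeping the rule as a parameter is what lets a better one be swapped in."""
        return rule(values)

    # Example: a 1-minute volume series with two missing values.
    print(impute_minute_volumes([30, None, None, 45], rule=linear_in_time))
    # -> [30, 35.0, 40.0, 45]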
3.1 Likely Evaluation Focus
Based on the document, it appears that Build 1 can be evaluated in terms of:
Data structure/formats
Accessibility/Usability from the users’ perspectives
Quality control procedures
Imputation procedures
4. Technical Comments and Questions
Comments on the draft functional requirements document are provided in Table 1,
below. Each entry gives the comment number, the comment or question, and its
source (SAIC, CSI, or TRAC); the table's Response/Disposition column is blank
in this draft.

Table 1. Technical Comments and Questions
1. (CSI) Page 3. Assuming that "stations" are where traffic data are
represented for all lanes in a direction (since number of lanes is a data
element in the station record), what is the relationship to individual
detectors? Are the data reported from the field by detector, or are they
aggregated across all lanes either in the field or immediately prior to entry
into the ADMS? Will the ADMS have the ability to save data by lane?

2. (SAIC) Page 7. No real issue here; just wanted to note that legacy data
mapping and migration is the place where many projects run into problems, as is
standardizing domain values from multiple legacy systems. This needs to be well
documented and kept on track.

3. (SAIC) Page 24. Each relationship should have a verb phrase describing the
relationship from (at least) parent to child.

4. (SAIC) Page 25. It looks like there is a single table to hold real-time,
dynamic traffic information at possibly 1-minute intervals. Once all the
traffic stations in the Hampton area are included (not to mention additional
areas in the state), it is not clear how the system will handle contention,
especially if every station is recording at 1-minute intervals.
5. (SAIC) Page 30. Shouldn't the Location_Code in this table be a foreign key
from the Location_Code Table, and shouldn't the 19 values that are valid for
this table be a subset of the values in the Location_Code Table in the Incident
Database? If they are not the same domain as the Incident Database
Location_Code Table, then the column and its value table should be renamed to
avoid ambiguity. This is true for all look-up code and value tables.
6. (SAIC) Page 33.
   1. As drawn, the Incident Table key appears incomplete. Except for
      Location_Codes, the other tables don't have keys indicated. If the idea
      is that TMS Call Number is the key for all these tables, then the ER
      diagram may not be correct as drawn. The "INC_xxx" tables should probably
      be associative tables representing the resolution of a many-to-many
      relationship between, for example, one or more incidents and one or more
      Agencies. The relationship would then be 1-M from Incident to INC_Agency
      and 1-M from an Agency table to INC_Agency.
   2. Relationships should be named.
   3. Table names should be singular.
7. (SAIC) Page 34. General comment re all the tables: ideally, primary keys
should not be intelligent.
8. (SAIC) Page 35.
   1. An agency responds to more than one Incident, and an Incident is
      responded to by more than one Agency. This should be an associative table
      with both columns together forming the PK for the table and each column
      also being an FK from its parent table (Incident and Agency,
      respectively). Agency Name should be replaced by Agency Code, which
      should be the PK of an Agency table; Agency Name is a column in the
      Agency table. (A rough sketch of this keying follows the table.)
   2. Comment 1 also applies to the "INC_Assist" table.
9. (SAIC) Page 36.
   1. If no unique identifier for an automobile is being stored (VIN number,
      license number, etc.), and since none of the fields except TMS Call
      Number will be used for this project, perhaps an "Automobile Sequence
      Number" (start with one and increment) should be created to record each
      instance of an automobile involved in an incident. The relationship
      should be 1-to-many from Incident to INC_Automobile. The PK of
      INC_Automobile would be TMS Call Number and Automobile Sequence Number.
   2. If there is a true one-to-one relationship between INC_Roadway and
      Incident, then shouldn't all the columns in INC_Roadway be included in
      the Incident table (and the INC_Roadway table deleted)?
   3. In any event, the ER Diagram shows a 1-M relationship between INC_Roadway
      and Incident, making the diagram more accurate. It seems there should be
      a 1-M relationship between a (to-be-defined) ROADWAY table and the
      Incident table: although an Incident can occur on only one Roadway, a
      Roadway can be the site of more than one Incident. Therefore, Roadway
      should be an FK in the Incident table.
10. (SAIC) Page 19, Section 3.2.1.4 Outputs: "Along with any traffic data
output, … the percentage of good records… ." Will the percentage of good
records used include those derived through imputation as well as those
showing 0?
11. (CSI) Corridors: it would be good if the lengths of the corridors could be
explicitly specified. One could compute them from the mileposts of the
inclusive stations, but it would be easier if they were explicit.
12. (CSI) Corridor Sections: could more information be provided on these and
their use? It appears that the incident data structure doesn't use them
(route/direction/milepost seems to be used for incident location). Or are these
supposed to be related to LOCATION_CODE, the broad area definition used in the
incident data structure?
13. (CSI) Station Location: some indication of where the detector is located
relative to interchange configuration would be helpful. Examples would be:
immediately upstream of an exit ramp, between an exit and entrance ramp,
downstream of an entrance ramp, in the middle of a known weaving section, etc.
For volumes especially, it makes a big difference. The data element DESCRIPTION
may be a place to hold this, but it is currently free form.
14. (CSI) Incident Entity Relationships: the incident data being collected are
very valuable; we've found this to be a large hole in TMC data. A suggestion:
would it be useful to transfer the data elements LANE_CLOSE_START and
LANE_CLOSE_END to the INC_ROADWAY table? This way you could capture when the
number of lanes blocked changes during the course of an incident (e.g., ER
crews closing an additional lane for clearance activities). This assumes that
the data input is set up this way, though, and it is extra work for the
operators.
15. (CSI) Good Data Availability: when users request a file (e.g., CSV format),
will this information be included in the individual records? For example, if a
user wants 15-minute aggregations, will there also be three additional fields,
one each for volume, speed, and occupancy, indicating the percent of good
1-minute data represented? (A rough sketch of this kind of roll-up follows the
table.)
16. (TRAC) Section 1.7, Definitions: the report says that the corridor
definitions are fixed. That is fine for now, but I would hope that there are
future plans for expansion of the surveillance system, especially since this
system is supposed to be applied statewide. So, rather than having a "fixed"
corridor length, shouldn't this be "semi-fixed" (meaning the definition is
"fixed," but can be altered through a specific menu)? This has software design
implications:
   1) The system needs a mechanism (table) to hold the "current" end points.
   2) The list of stations needs to be able to grow over time.
   3) The output reports need to list what the (variable) beginning and ending
      points are.
(This area has lots of implications as this database is implemented in a large
number of other geographic regions, especially ones where the geographic
coverage of the surveillance system is expanding.)
17. (TRAC) Regarding the predetermined AM and PM peak period definitions: have
these values been agreed to statewide? (Some use a four-hour PM peak, which is
probably applicable in Northern Virginia, even if it is too long for Hampton
Roads.)
18. (TRAC) Section 2.3.3, Data Imputation: first, does the STL store data by
lane or only by STATION? How is SPEED being computed (are these dual loops or
single loops)? These questions are important in the determination of data
quality and the resulting discussion of data imputation. As best I can tell
from the Functional Requirements document, they are working with 1-minute data
for all lanes in a direction ("station data"), but that level of data is made
up of a much finer grain of detail (loop data by lane), and not all loops at a
location may be operating. So, how do you know whether that is the case? If not
all loops at a location are operating (say 2 out of 3 are working correctly),
what happens? How is this reported to the STL? This will have a huge impact on
the need for imputation as well as the best way of performing that imputation.
19. (TRAC) Section 3.1.2.8.4.1.9: Saving the picture is great, but the user
also needs to be able to get the numbers (data) that underlie the image. We've
found that we most often "print" the picture, but that for every picture we
print, we get four requests for what the numbers are, because they are being
used to feed some other analytical process. (For example, we plot volumes by
time of day, but people take our 5-minute volume numbers and compute a wide
variety of other statistics from them.)
20. (TRAC) Section 3.2.1.2.4, Additional Inputs: The system needs to be able to
handle a mix of good and bad data. What happens when Station 1 has five days of
good data, but Station 2 only has four? If I select "Use raw data," does my
"analyze the week" request get rejected, or does it ignore the fifth day for
Station 2?
21. (TRAC) Appendix A, OCCUPANCY definition: We (WSDOT/TRAC) have found that
integer lane occupancy statistics cause us problems with the calculation of
speed estimates because of the lack of precision in the occupancy data. If they
could make this a floating point number, it might be better (or move the
decimal point and simply carry a larger integer value). (A worked example of
the precision issue follows the table.)
22. (TRAC) Appendix A, continued, SPEED: Where is this value coming from? We've
also found (as has PeMS) that speed varies considerably across lanes
(especially in congested conditions near ramp terminals), and that there are
good reasons to keep data by lane for that reason. This highlights the "station
specific" design of the database (at least I THINK it is storing data only by
STATION, and not by lane).
23. (TRAC) Appendix A, continued, SCREENING_TESTS: There are many data we have
that fall into neither the "good" nor the "bad" category. We've been very glad
to have a "questionable" category; they might want to consider that as well.
24. (TRAC) Page 26: Have they considered "post processing" their old data so
that they have a "SCREENING_TESTS" value for data submitted prior to April
2002? This would certainly make the "earlier" data more useful analytically.
(Note that this could become an Evaluation Point: are the earlier data worth
using, especially if they don't have any QA/QC flags?)
25. (TRAC) Page 29, MILE_MARKER: Can they use their mile markers as distance
measures? Are they accurate? (Ours at WSDOT are OK, but have definite
limitations when we use them to compute corridor distances.)
26. (TRAC) Page 29: The variable TOTAL_LANES describes the number of lanes
present, but there is no variable giving the total number of lanes reporting
data. If not all lanes report data, are the remaining data tossed as "bad"? Is
the missing lane "imputed"? Who does this? When? How is it recorded? If 2 of 3
lanes are missing, does that change their process?
27. (TRAC) Page 31, HOLIDAY: What about Martin Luther King's birthday Monday,
or Presidents Day Monday?
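As a rough illustration of the associative-table keying suggested in comments 6
and 8, the SQLite sketch below resolves the many-to-many relationship between
incidents and agencies. The column and table names beyond those mentioned in
the comments (TMS Call Number, Agency Code, Agency Name, INC_Agency) are
assumptions for the example, not the ADMS schema.

    import sqlite3

    # INC_AGENCY as an associative table whose composite primary key is made of
    # foreign keys to its two parents (Incident and Agency).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE INCIDENT (
        TMS_CALL_NUMBER TEXT PRIMARY KEY
        -- other incident columns would go here
    );
    CREATE TABLE AGENCY (
        AGENCY_CODE TEXT PRIMARY KEY,
        AGENCY_NAME TEXT
    );
    -- One incident involves many agencies and one agency responds to many
    -- incidents, so the associative table carries both keys.
    CREATE TABLE INC_AGENCY (
        TMS_CALL_NUMBER TEXT NOT NULL REFERENCES INCIDENT(TMS_CALL_NUMBER),
        AGENCY_CODE     TEXT NOT NULL REFERENCES AGENCY(AGENCY_CODE),
        PRIMARY KEY (TMS_CALL_NUMBER, AGENCY_CODE)
    );
    """)

    # "All agencies that responded to a given incident" is then a simple join.
    rows = conn.execute("""
        SELECT a.AGENCY_NAME
        FROM INC_AGENCY ia JOIN AGENCY a ON a.AGENCY_CODE = ia.AGENCY_CODE
        WHERE ia.TMS_CALL_NUMBER = ?
    """, ("example-call-number",)).fetchall()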
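Comment 15 can be illustrated with the sketch below, which rolls hypothetical
1-minute records (a DataFrame with a datetime index, "volume", "speed",
"occupancy", and a boolean "good" screening flag) up to 15 minutes and carries
a percent-good field alongside the aggregated values. The record layout is
invented for the example and is not the ADMS CSV format; the comment asks
whether there would be one such field per measure.

    import pandas as pd

    def aggregate_15min(minute_data: pd.DataFrame) -> pd.DataFrame:
        """Roll 1-minute records up to 15 minutes and report, for each interval,
        the percentage of 1-minute records that passed screening ('good')."""
        good = minute_data[minute_data["good"]]
        out = good.resample("15min").agg(
            {"volume": "sum", "speed": "mean", "occupancy": "mean"}
        )
        counts = minute_data["good"].resample("15min")
        out["pct_good_records"] = 100.0 * counts.sum() / counts.count()
        return out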
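Comments 18, 21, and 22 all touch on single-loop speed estimation, and the
precision concern in comment 21 is easy to see with a worked example. A common
single-loop estimate takes speed as flow divided by density, with density
derived from occupancy through a "g" factor tied to an assumed effective
vehicle length; the specific formula, vehicle length, and traffic values below
are illustrative, not VDOT's.

    def single_loop_speed_mph(flow_vph: float, occupancy_fraction: float,
                              effective_length_ft: float = 20.0) -> float:
        """Speed from a single loop: flow divided by density, where density is
        derived from occupancy via an assumed effective vehicle length."""
        density_vpm = occupancy_fraction * 5280.0 / effective_length_ft  # veh/mile
        return flow_vph / density_vpm

    flow = 1200.0  # 20 vehicles in one minute, expressed as veh/hour
    print(single_loop_speed_mph(flow, 0.095))  # ~47.8 mph with the "true" occupancy
    print(single_loop_speed_mph(flow, 0.09))   # ~50.5 mph if occupancy is stored as 9
    print(single_loop_speed_mph(flow, 0.10))   # ~45.5 mph if occupancy is stored as 10

Rounding occupancy to the nearest whole percent swings the estimate by roughly
2.5 mph in either direction in this example, which is the kind of precision
loss comment 21 describes.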
5. Miscellaneous Comments and Suggestions
1. One of the main issues raised in the document is to what degree new
applications will be developed, as opposed to simply providing data and reports
for existing processes. It seems like most will fall in the latter category.
2. Are records being kept of the labor hours required to develop and manage the
project, preferably by labor category? This would include the requirements
definition/design as well as the actual coding and testing.
3. Regarding imputation, would it be useful to categorize historical data by
incidents/non-incidents? Then, if you could determine that a lane-blocking
incident was affecting flow, you could pull from the days with an incident.
This may be too complicated for Build 1.
4. Any plans for capturing work zone characteristics?
5. Another idea is to compute averages for each day of week individually rather than
just weekday/weekend. Actually, you’d only have to do this if the day-to-day
variability is significant.
6. Still another idea is based on some preliminary work the evaluation team did
with Detroit data (volume only). Instead of historical averages, historical
growth rates representing the average percent growth between successive time
periods were used. This takes advantage of any data that exist prior to the
detection of "bad" data. As more and more bad data are identified, however, you
lose this advantage. (A rough sketch of this approach follows this list.)
7. Suggest you add a data element in the INC_AUTOMOBILE table to capture
vehicle type. Make the codes correspond to those used by MCMIS for safety
reporting:
   1 = Bus (>15 seats including driver)
   2 = Single unit truck (2 axle, 6 tire)
   3 = Single unit truck (3 or more axles)
   4 = Truck/trailer
   5 = Truck tractor (bobtail)
   6 = Tractor/semi-trailer
   7 = Tractor/doubles
   8 = Tractor/triples
   9 = Heavy truck, unclassified
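A minimal sketch of the growth-rate idea in item 6 follows, under the
assumption that "growth rate" means the average ratio between successive time
periods in the historical data and that bad values are flagged as None;
everything here, including the function names and numbers, is illustrative.

    def average_growth_rates(historical_days):
        """Average ratio of each period's volume to the preceding period's,
        computed across a set of complete historical days."""
        n_periods = len(historical_days[0])
        rates = []
        for t in range(1, n_periods):
            ratios = [day[t] / day[t - 1] for day in historical_days if day[t - 1]]
            rates.append(sum(ratios) / len(ratios))
        return rates  # rates[t - 1] relates period t to period t - 1

    def fill_forward_with_growth(today, rates):
        """Replace values flagged bad (None) by growing the last good value forward,
        which uses whatever data exist before the bad data were detected."""
        out = list(today)
        for t in range(1, len(out)):
            if out[t] is None and out[t - 1] is not None:
                out[t] = out[t - 1] * rates[t - 1]
        return out

    history = [[100, 120, 150, 140], [90, 110, 140, 130]]
    print(fill_forward_with_growth([95, None, None, 135], average_growth_rates(history)))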