Manual Data Collection

  

After reading the chapter by Capri (2015) on manual data collection, answer the following questions:

What were the traditional methods of data collection in the transit system?
Why are the traditional methods insufficient in satisfying the requirement of data collection?
Give a synopsis of the case study and your thoughts on the optimization and performance measurement requirements and on how the expensive, labor-intensive nature of manual data collection affects them.


Answer all of the questions above in APA 7 format, with a heading for each question. Include at least two peer-reviewed sources to support your work. The paper should contain at least 2 pages of content (not counting the cover page or reference page).

In: Data Mining ISBN: 978-1-63463-738-1

Editor: Harold L. Capri 2015 Nova Science Publishers, Inc.

Chapter 1

TRANSIT PASSENGER ORIGIN INFERENCE USING SMART CARD DATA AND GPS DATA

Xiaolei Ma¹, Ph.D. and Yinhai Wang², Ph.D.

¹ School of Transportation Science and Engineering, Beihang University, Beijing, China
² Department of Civil and Environmental Engineering, University of Washington, Seattle, WA, US

Copyright 2014. Nova Science Publishers, Inc. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.

EBSCO Publishing : eBook Collection (EBSCOhost) – printed on 10/28/2022 9:45 AM via UNIVERSITY OF THE CUMBERLANDS
AN: 956104 ; Ma, Xiaolei, Capri, Harold L..; Data Mining: Principles, Applications and Emerging Challenges
Account: s8501869.main.ehost

ABSTRACT

To improve customer satisfaction and reduce operation costs, transit authorities have been striving to monitor their transit service quality and identify the key factors that attract transit riders. Traditional manual data collection methods are unable to satisfy transit system optimization and performance measurement requirements due to their expensive and labor-intensive nature. The recent advent of passive data collection techniques (e.g., Automated Fare Collection and Automated Vehicle Location) has shifted a data-poor environment to a data-rich environment and offered transit agencies the opportunity to conduct comprehensive transit system performance measures. Although it is possible to collect highly valuable information from ubiquitous transit data, data usability and accessibility remain difficult. Most Automatic Fare Collection (AFC) systems are not designed for transit performance monitoring, and additional passenger trip information cannot be directly retrieved. Interoperating and mining heterogeneous datasets would enhance both the depth and breadth of transit-related studies. This study proposes a series of data mining algorithms to extract individual transit riders' origins using transit smart card and GPS data. The primary data source of this study is the AFC system in Beijing, where a passenger's boarding stop (origin) and alighting stop (destination) on a flat-rate bus are not recorded on the check-in and check-out scan. The bus arrival time at each stop can be inferred from GPS data, and each passenger's boarding stop is then estimated by fusing the identified bus arrival time with smart card data. In addition, a Markov chain based Bayesian decision tree algorithm is proposed to mine passengers' origin information when GPS data are absent. Both passenger origin mining algorithms are validated against either on-board transit survey data or personal GPS logger data. The results demonstrate the effectiveness and efficiency of the proposed algorithms in extracting passenger origin information. The estimated passenger origin data are highly valuable for transit system planning and route optimization.

Keywords: Automated fare collection system, transit GPS, passenger origin inference, Bayesian decision tree, Markov chain

INTRODUCTION

According to the 2000 Census in the United States, approximately 76% of people chose privately owned vehicles to commute to work in 2000 (ICF Consulting, 2003). More recent results from the 2009 American Community Survey indicate that 79.5% of home-based workers drive alone when commuting (McKenzie and Rapino, 2009). Many developing countries, e.g., China, also rely on privately owned vehicles for commuting. For example, more than 34% of Beijing residents chose cars as their primary travel mode while only 28.2% chose transit in 2010 (Beijing Transportation Research Center, 2012). Public transit has been considered an effective countermeasure to reduce congestion, air pollution, and energy consumption (Federal Highway Administration, 2002). According to the 2005 urban mobility report by the Texas Transportation Institute (2005), travel delay in 2003 would have increased by 27 percent without public transit; in the most congested metropolitan cities of the U.S. in particular, public transit services saved more than 1.1 billion hours of travel time. Moreover, public transit can help enhance business and reduce city sprawl through transit-oriented development (TOD). During certain emergency scenarios, public transit can even act as a safe and efficient transportation mode for evacuation (Federal Highway Administration, 2002).

For these reasons, it is of critical importance to improve the efficiency of the public transit system and to encourage more roadway users to use public transit. To fulfill these objectives, transit agencies need to understand where further improvements can be made and whether community goals are being met. A well-developed performance measure system will facilitate decision making for transit agencies. Transit agencies can evaluate transit ridership trends under fare policy changes and identify where and when better transit service should be provided. In addition, transit agencies are required to summarize transit performance statistics for reporting to either the National Transit Database (Kittelson & Associates et al., 2003) or members of the general public who are interested in knowing how well transit service is being provided. Nevertheless, developing a set of structured performance measures often requires a large amount of data and the corresponding domain knowledge to process and analyze these data. These obstacles make it challenging for transit agencies to commit the necessary time and effort.

Traditionally, transit agencies have relied heavily on manual data collection methods to gather transit operation and planning data (Ma et al., 2012). However, traditional data collection methods (e.g., travel diaries and surveys) are fairly costly and difficult to implement at a multiday level due to their low response rate and accuracy. Transit agencies have spent tremendous manpower and resources undertaking manual data collection, and consumed a significant amount of energy and time post-processing the raw data.

With advances in information technologies in intelligent transportation systems (ITS), the availability of public transit data has been increasing in the past decades, which has gradually shifted public transit systems into a data-rich paradigm. The Automatic Fare Collection (AFC) system and the Automatic Vehicle Location (AVL) system are two common passive data collection methods. The AFC system, also known as the smart card system, records and processes fare-related information using either a contactless or a contact card to complete the financial transaction (Chu, 2010). There are two typical types of AFC systems: entry-only and distance-based. In an entry-only AFC system, passengers are only required to swipe their smart cards over the card reader during boarding, whereas in a distance-based AFC system passengers need to check in when boarding and check out when alighting. AVL and AFC technologies hold substantial promise for transit performance analysis and management at a relatively low cost. However, historically, both AVL and AFC data have not been used to their full potential. Many AVL and AFC systems do not archive data in a readily usable manner (Furth, 2006). The AFC system was initially designed to reduce the workload of tedious manual fare collection, not for transit operation and planning purposes, and thereby certain critical information, such as the specific spatial location of each transaction, may not be directly captured. The AVL system tracks transit vehicles' geospatial locations by Global Positioning System (GPS) at either a constant or varying time interval. The accuracy of GPS occasionally suffers from signal loss due to tall building obstructions in urban areas (Ma et al., 2011). Both the AFC system and the AVL system have inherent drawbacks in monitoring transit system performance, and require analytical approaches to eliminate erroneous data, remedy missing values, and mine unseen and indirect information.

The remainder of this paper is organized as follows. Transit smart card data and GPS data are described in the next section. Based on these data sets, a data fusion method is first proposed that integrates them with roadway geospatial data to estimate transit vehicles' arrival information. Then, a Bayesian decision tree algorithm is presented to estimate each passenger's boarding stop when GPS data are unavailable. Considering the heavy computational burden of decision tree algorithms, the Markov chain property is taken into account to reduce the algorithm complexity. On-board survey and GPS data from the Beijing transit system are used to test and verify the proposed algorithms. Conclusions and future research efforts are summarized at the end of this paper.

RESEARCH BACKGROUND

Data from the AFC system and the AVL system are the two primary sources in this study. Beijing Transit Incorporated began to issue smart cards on May 10, 2006. The smart card can be used in both the Beijing bus and subway systems. Due to the discounted fares (up to 60% off) provided by the smart card, more than 90% of transit riders paid for their transit trips with smart cards in 2010 (Beijing Transportation Research Center, 2010). Two types of AFC systems exist in Beijing transit: flat fare and distance-based fare. On flat fare buses, transit riders pay a fixed rate when entering by tapping their smart cards on the card reader; thus, only check-in scans are necessary. For the distance-based AFC system, transit riders need to swipe their smart cards during both check-in and check-out, holding their smart cards near the card reader device to complete transactions when entering or exiting buses. In the Beijing subway system, passengers need to tap their smart cards on top of the fare gates when entering and exiting subway stations. Both boarding and alighting information (time and location) are recorded by the fare gates. Although the transit smart card is convenient and efficient, the following issues still prevent transit agencies from taking full advantage of smart card data for operational purposes:

Passenger boarding and alighting information missing

Due to a design deficiency in the smart card scan system, the AFC system on flat fare buses does not save any boarding location information, whereas on distance-based fare buses the AFC system stores boarding and alighting locations but not the boarding time. Key information stored in the database includes smart card ID, route number, driver ID, transaction time, remaining balance, transaction amount, boarding stop (only available for distance-based fare buses), and alighting stop (only available for distance-based fare buses).

Massive data sets

More than 16 million smart card transactions are generated per day. Among these transactions, 52% are from flat-rate bus riders. These smart card transactions are scattered across a large-scale transit network with 52,386 links and 43,432 nodes, as presented in Figure 1.

Figure 1. Beijing Transit GIS Network.


Limited external data with poor quality

Only approximately 50% of transit vehicles in Beijing are equipped with GPS devices for tracking. GPS data are periodically sent to the central server at a predetermined interval of 30 seconds. However, the collected GPS data suffer from two major data quality issues: (1) vehicle direction information is missing; (2) GPS points fluctuate (Lou et al., 2009). Map matching algorithms are needed to align the inaccurate GPS spatial records onto the road network. In addition, most transit routes are not designed to have fixed schedules because of high ridership demand, and only certain routes with a long distance or headway follow schedules at each stop (Chen, 2009). These characteristics of the Beijing AFC and AVL systems make it more challenging to process and mine useful information.

It is noteworthy that the AFC system used in Beijing is not a unique case. Most cities in China employ similar AFC systems in which passengers' origin information is absent, such as Chongqing (Gao and Wu, 2011), Nanning (Chen, 2009), and Kunming (Zhou et al., 2007). In other developing countries, such as Brazil, the AFC system does not record any boarding location information either (Farzin, 2008). Therefore, a solution for extracting passenger boarding and alighting information would benefit transit agencies with imperfect smart card (SC) data internationally.

TRANSIT PASSENGER ORIGIN INFERENCE

Because smart card readers on flat-rate buses do not record passengers' boarding stops, it is desirable to infer individual boarding locations from smart card transaction data. In this section, two primary approaches are presented to achieve this goal. Approximately 50% of transit vehicles in Beijing's entry-only AFC system are equipped with GPS devices. Therefore, a data fusion method using GPS data, smart card data, and GIS data is first developed to estimate each bus's arrival time at each stop and infer each passenger's boarding stop. Then, for buses without GPS devices, a Bayesian decision tree algorithm is proposed that uses smart card transaction times and applies Bayesian inference theory to describe the likelihood of each possible boarding stop. To extend the usability of the proposed Bayesian decision tree algorithm to large-scale datasets, Markov chain optimization is used to reduce the algorithm's computational complexity. Both transit passenger origin inference algorithms are validated using external data (e.g., on-board survey data and GPS data).

Passenger Origin Inference with GPS Data

In the first step, a GPS-based arrival information inference algorithm is presented to estimate the arrival time at each transit stop; the inferred stop-level arrival time is then matched with the timestamp recorded in the AFC system. Each smart card transaction record is assigned the ID of the stop whose arrival time is temporally closest. The logic flow chart is shown in Figure 2. The major data processing procedures are detailed below.

Figure 2. Flow Chart for Passenger Origin Inference with GPS Data.

Bus Arrival Time Extraction

Three primary data sources are involved in the passenger information extraction: vehicle GPS data, transit stop spatial location data, and flat-fare-based smart card transaction data. A transit GIS network contains the geospatial location of each stop on every transit route. The GPS device mounted on each bus records the bus's location and timestamp every 30 seconds, but the quality of the collected GPS records is unsatisfactory: no directional information is recorded in the Beijing AVL system, and GPS points fall off the roadway network due to satellite signal fluctuation. Data preprocessing is therefore required prior to bus arrival time estimation. A program was written to parse and import raw GPS data into a database automatically. Key fields of a GPS record are shown in Table 1.

Table 1. Examples of GPS raw data

Vehicle ID   Date time             Latitude   Longitude   Spot speed   Route ID
00034603     2010-04-07 09:28:57   39.73875   116.1355    9.07         00022
00034603     2010-04-07 09:29:27   39.73710   116.1358    14.26        00022
00034603     2010-04-07 09:29:58   39.73592   116.1357    19.63        00022
00034603     2010-04-07 09:30:28   39.73479   116.1357    0            00022
00034603     2010-04-07 09:30:58   39.73420   116.1357    3.52         00022

The first step is to estimate the bus arrival time at each stop by joining GPS data and the stop-level geo-location data. A buffer area can be created around each stop of a given transit route using GIS software. Within this area, several GPS records are likely to be captured. However, identifying the geospatially closest GPS record to each stop is challenging, since the buffer zone may contain a number of GPS records of unknown direction. In GIS, each link (i.e., polyline) on which a transit stop is located is composed of a start node and an end node, which implies that the direction of each GPS record can be inferred by comparing the link direction with the direction of travel between two consecutive GPS records. With the identified direction, the distance from each GPS point to the stop can be calculated, and the timestamp of the point with the minimum distance is regarded as the bus arrival time at that stop. Figure 3 visually demonstrates this procedure. The inbound stop represents the physical location of a particular transit stop, and this stop is snapped to a transit link whose direction is defined by a start node and an end node. By comparing the driving direction from GPS records with the link direction, the nearest GPS record to this stop can be identified; it is marked by the red five-pointed star on the map. The timestamp associated with this star is taken as the arrival time for this inbound stop. The merit of the bus arrival time estimation algorithm lies in its efficiency. Rather than searching all the GPS data to identify the traveling direction for each stop, the proposed algorithm shrinks the search area and filters out unlikely GPS data. This greatly alleviates the computational burden and is relatively easy to implement on large-scale datasets, which is critical for processing the tremendous volume of data within an acceptable time period.
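The arrival time step described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the 100 m buffer radius, the 90 degree heading tolerance, the tuple layouts, and all function names are assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees within [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def angle_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def estimate_arrival(gps_fixes, stop, link_start, link_end,
                     buffer_m=100.0, max_heading_diff=90.0):
    """Return the timestamp of the GPS fix closest to the stop among fixes
    that lie inside the buffer and travel in the link direction.

    gps_fixes: list of (timestamp, lat, lon), sorted by timestamp.
    stop, link_start, link_end: (lat, lon) tuples.
    The first fix has no predecessor, so it is never a candidate.
    """
    link_bearing = bearing_deg(*link_start, *link_end)
    best = None  # (distance_m, timestamp)
    for prev, cur in zip(gps_fixes, gps_fixes[1:]):
        t, lat, lon = cur
        dist = haversine_m(lat, lon, stop[0], stop[1])
        if dist > buffer_m:
            continue  # outside the buffer around the stop
        travel_bearing = bearing_deg(prev[1], prev[2], lat, lon)
        if angle_diff(travel_bearing, link_bearing) > max_heading_diff:
            continue  # moving against the link direction
        if best is None or dist < best[0]:
            best = (dist, t)
    return None if best is None else best[1]
```

Filtering by buffer first keeps the direction check to a handful of fixes per stop, which mirrors the efficiency argument made in the text.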

Figure 3. Boarding Time Estimation with GPS Data and Transit Stop Location Data.

Passenger Boarding Location Identification with Smart Card Data

For each smart card transaction record, the boarding stop can be estimated by matching the recorded timestamp with the identified bus arrival time. As presented in Figure 4, the transaction time of each record is compared with the inferred bus arrival time at each stop. The record is assigned to the stop whose bus arrival time is temporally closest to its transaction time. Since passengers board the bus within a relatively short time interval, this data fusion method is able to capture almost all missing boarding stops.
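The matching rule above (assign each transaction to the stop with the temporally closest arrival) can be sketched as follows. The input layouts, the use of seconds as the time unit, and the function name are assumptions for illustration; the arrival list is assumed to be non-empty and sorted by time.

```python
from bisect import bisect_left


def assign_boarding_stops(arrivals, transactions):
    """Assign each transaction to the stop with the closest arrival time.

    arrivals: non-empty list of (arrival_time_s, stop_id), sorted by time.
    transactions: list of (card_id, transaction_time_s).
    Returns a list of (card_id, transaction_time_s, stop_id).
    """
    times = [t for t, _ in arrivals]
    result = []
    for card_id, tx_time in transactions:
        i = bisect_left(times, tx_time)
        # Only the arrivals just before and just after the transaction can be closest.
        candidates = arrivals[max(i - 1, 0): i + 1]
        _, stop_id = min(candidates, key=lambda a: abs(a[0] - tx_time))
        result.append((card_id, tx_time, stop_id))
    return result
```

A binary search is used here so that each transaction is compared with at most two arrivals; on a tie the earlier stop is returned, which is an arbitrary choice of this sketch.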


Figure 4. Boarding Stop Identification with Bus Arrival Time.

In addition, because the arrival times at all stops of a particular transit route can be estimated, the average travel time between two adjacent stops can be calculated as well. This statistic is not only critical for transit performance measures, but also provides prior information for passenger origin inference when GPS data are absent.
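As a rough illustration of the travel time statistic mentioned above, the following sketch averages stop-to-stop travel times over several trips. The trip layout (a list of stop and arrival time pairs in stop order) and the function name are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def average_segment_times(trips):
    """Average travel time between adjacent stops over several trips.

    trips: list of trips; each trip is a list of (stop_id, arrival_time_s)
    in the order the stops are served.
    Returns {(from_stop, to_stop): mean travel time in seconds}.
    """
    samples = defaultdict(list)
    for trip in trips:
        for (a, ta), (b, tb) in zip(trip, trip[1:]):
            samples[(a, b)].append(tb - ta)
    return {segment: mean(values) for segment, values in samples.items()}
```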

Validation

Compared with bus arrival time, door opening time can be matched more accurately with smart card transaction time, because a bus may not stop exactly at each transit stop for passenger boarding. The inferred bus arrival time is therefore subject to error when it is matched with smart card data. To validate the accuracy of the proposed data fusion algorithm for passenger origin inference, an on-board transit survey was undertaken to collect the bus door opening time and arrival location at each stop of route 651 on January 13, 2013. Handheld GPS devices were used to track the geospatial location of moving buses every 15 seconds. The survey lasted from 8:00 AM to 1:00 PM, and a total of 75 bus door opening times were manually recorded. These door opening time records were then compared with smart card transactions from 417 passengers, and the resulting stops can be considered ground-truth data. Comparing the ground-truth data with the results from the proposed GPS data fusion approach, 406 boarding stops were accurately inferred and 11 boarding stops differed from the ground-truth data within a one-stop error range. The proposed algorithm thus achieves an accuracy of 97.4%.
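The reported accuracy follows directly from the counts given above, as this short check shows:

```latex
\[
\text{accuracy} = \frac{406}{417} \approx 0.974 = 97.4\%,
\qquad
\text{one-stop errors} = \frac{11}{417} \approx 2.6\%.
\]
```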

Passenger Origin Inference with Smart Card Data

A fair number of buses still have no GPS devices, and thus the bus arrival time at each transit stop is not directly measured. However, most passengers scan their cards immediately when boarding, and almost all passengers should complete the check-in scan before arriving at the next stop. This indicates that the first passenger's transaction time can be safely assumed to be the boarding time of the group of passengers at the same stop. The challenge is then to identify the bus location at the moment of the SC transaction so that we can infer the boarding stop for that passenger. However, this is not easy because the SC system on flat-rate buses does not record bus location. We know the time each transaction occurred on a bus of a particular route operated by a particular driver, but nothing else is known from the SC transaction database. Nonetheless, we are able to extract boarding volume changes over time and passengers who made transfers. By mining these data and combining them with transit route maps, we may be able to accomplish our goal. Therefore, a two-step approach is designed for passenger origin data extraction: smart card data clustering and transit stop recognition. To implement the proposed algorithm efficiently, a Markov chain based optimization approach is applied to reduce the computational complexity.

Smart Card Data Clustering

Transaction Data Classification

First, we need to sort SC transactions by transit vehicle number. This results in a list of SC transactions on each vehicle for the entire period of operations each day. During the operational period, the vehicle may make two to ten round-trip runs depending on the round-trip length and roadway conditions. At a terminal station, a transit vehicle may take a break or continue running, so there is no obvious signal for the end of a trip (a trip is defined as the journey from one terminus to the other). Meanwhile, a varying number of passengers board at each stop, and some stops have no boarding passengers.

For stops with several passengers boarding, all transactions can be classified into one group based on the interval between their transactions. Thus, the clustered SC transactions can be represented by a time series of check-in passenger volumes at stops, as shown in Table 2.

Table 2. Examples of Clustered SC transactions

Transaction Cluster No.   Stop ID   Stop Name   Total Transactions   Transaction Timestamp   Time Difference
1                         Unknown   Unknown     18                   5:26:36                 0:14:26
2                         Unknown   Unknown     9                    5:41:02                 0:03:16
3                         Unknown   Unknown     11                   5:44:18                 0:04:35
4                         Unknown   Unknown     27                   5:48:53                 0:01:00

In Table 2, total transactions indicates the total number of passengers boarding at one stop; transaction timestamp is the time when the first passenger boards at this stop; and time difference is the elapsed time between the boarding time at this stop and that at the next stop with boarding passengers. Unlike most entry-only AFC systems in the United States, the stop name and ID of each transaction are unknown in Beijing's AFC system. Most buses in service follow the predefined order of stops; however, it is still possible that no passenger boards at a specific stop, and thus two consecutive SC transaction clusters do not necessarily correspond to two physically consecutive stops. This further complicates the situation, and an algorithm is needed to map each cluster to the corresponding boarding stop ID.

In summary, the smart card data clustering algorithm contains three steps:

Step 1: All transaction data for each bus are sorted by transaction timestamp in ascending order.

Step 2: For two consecutive records, if their transaction time difference is within 60 sec, the two transactions are included in one cluster; otherwise, another cluster is initiated.

Step 3: If the transaction time difference between two consecutive records is greater than 30 min or a driver change occurs, it is likely that the bus has arrived at the terminus and completed one trip. The next record is the beginning of the next bus trip.
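The three steps above can be sketched in Python as follows. The 60 second and 30 minute thresholds come from the text; the record layout, the nested-list output, and the function names are assumptions made for this illustration.

```python
def cluster_transactions(records, cluster_gap_s=60, trip_gap_s=30 * 60):
    """Group one bus's transactions into trips and boarding clusters.

    records: iterable of (timestamp_s, driver_id) for a single bus.
    Returns a list of trips; each trip is a list of clusters; each cluster
    is a list of timestamps.
    """
    ordered = sorted(records, key=lambda r: r[0])  # Step 1: sort by time
    trips = []
    prev_time = prev_driver = None
    for t, driver in ordered:
        if prev_time is None:
            trips.append([[t]])
        else:
            gap = t - prev_time
            if gap > trip_gap_s or driver != prev_driver:
                trips.append([[t]])        # Step 3: a new trip starts
            elif gap <= cluster_gap_s:
                trips[-1][-1].append(t)    # Step 2: same cluster
            else:
                trips[-1].append([t])      # Step 2: new cluster, same trip
        prev_time, prev_driver = t, driver
    return trips


def summarize(trip):
    """Rows shaped like Table 2: (cluster no., total transactions,
    first timestamp, seconds until the next cluster or None)."""
    rows = []
    for i, cluster in enumerate(trip):
        nxt = trip[i + 1][0] - cluster[0] if i + 1 < len(trip) else None
        rows.append((i + 1, len(cluster), cluster[0], nxt))
    return rows
```

Because Step 2 compares consecutive records only, a cluster can span more than 60 seconds in total as long as no single gap exceeds the threshold.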


The result of the clustering process is several sequences of clustered transactions. Each sequence may contain one or more trips of the transit vehicle. On particular routes, due to limited space at the terminus or a busy transit schedule, bus layover time may be too short to serve as a separator between trips. Such buses may have a very long clustered sequence, which makes the pattern discovery process very challenging. Furthermore, unfamiliar passengers or passengers boarding through the check-out doors (this happens on very crowded buses) may take longer than 60 seconds to scan their cards. The delayed transactions may cause cluster assignment errors. Again, this adds extra challenge to the follow-up passenger origin extraction process.

Transaction Cluster Sequence Segmentation

Beijing has a huge transit network with nearly 1,000 routes. It is quite common for passengers to transfer between transit routes. Through transfer activity analysis, we can further segment the clustered transaction sequence into shorter series to reduce the uncertainty in passenger origin-destination (OD) estimation (Jang, 2010). Two key principles used in transfer stop identification are:

(1) We assume the alighting stop on the previous route is spatially and temporally the closest to the boarding stop on the next route. This is reasonable because most passengers choose the closest stop for a transit transfer within a short period of time (Chu, 2008). Assume a passenger k makes a transfer from route i to route j within n minutes. If route i is a distance-based-rate bus line or a subway line, then we can identify the transfer station, which is also the boarding stop on route j. Even if both routes are flat-rate bus routes, if the transfer location is unique, we can still use the transfer information to identify the transfer bus stop ID and name. In this study, the transfer time duration n is 30 minutes, and the maximum distance between two transfer stops is 300 meters.

(2) We assume that the alighting time and the boarding time at each particular stop are similar. In this case, we can substitute one passenger's boarding stop with another passenger's alighting stop. Assume a passenger k makes a transfer from route i to route j. If route j is a subway line, where both boarding location and time are available, then we can estimate passenger k's alighting stop on route i, and this alighting stop can also be considered the boarding stop for those passengers who get on the bus at the same time.
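Principle (1) can be illustrated with a short Python sketch. The 30 minute and 300 meter thresholds come from the text; the function names, the stop dictionary layout, and the choice to return the single nearest stop are assumptions of this example, and it does not resolve the case where several candidate stops remain.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def infer_transfer_boarding_stop(alight_time_s, alight_pos, board_time_s,
                                 route_stops, max_gap_s=30 * 60, max_dist_m=300.0):
    """Infer the boarding stop on route j from a known alighting on route i.

    alight_pos: (lat, lon) of the known alighting stop on route i.
    route_stops: {stop_id: (lat, lon)} for the stops of route j.
    Returns the nearest stop within max_dist_m, or None when the time gap
    is outside [0, max_gap_s] or no stop is close enough.
    """
    gap = board_time_s - alight_time_s
    if not 0 <= gap <= max_gap_s:
        return None  # not treated as a transfer
    best = None  # (stop_id, distance_m)
    for stop_id, (lat, lon) in route_stops.items():
        d = haversine_m(alight_pos[0], alight_pos[1], lat, lon)
        if d <= max_dist_m and (best is None or d < best[1]):
            best = (stop_id, d)
    return None if best is None else best[0]
```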


Walking distance between the two stops should be taken into account when inferring the time at which the flat-rate bus arrives at the transfer stop. However, several possible boarding stops may exist because the direction is unknown in flat-rate smart card transactions, and thus additional data mining techniques are needed to find the boarding stop with the maximum likelihood. These data mining techniques are detailed in the next section.

Based on the identified transfer stops, we can further segment the transaction cluster sequence into shorter cluster series. Each series is bounded by either the termini or the identified bus stops. The segmented series of transaction clusters will be used as the input for the subsequent transit stop inference algorithm.

Data Mining for Transit Stop Recognition

Bayesian Decision Tree Inference

If we treat each segmented series of transaction clusters as an unknown pattern, this unknown pattern can be considered a sample of the sequential stops on the bus route. If every stop has boarding passengers, this unknown pattern is identical to the known bus stop sequence. Also, since the distance and speed limit between stops are known, travel time between stops is highly predictable if there is no traffic jam. In reality, however, the distribution of passengers boarding at any given stop may vary, and roadway congestion may cause unpredictable delays. Therefore, recognizing the unknown pattern is very challenging. Once the unknown pattern is recognized, the boarding stop for any passenger becomes clear.

The Bayesian decision tree algorithm is one of the widely used data mining

techniques for pattern recognition (Janssens et al.,

WK1 STUDENT REPLIES

PLEASE SEE ATTACHMENT…….

STUDENT REPLIES

BELOW ARE THE INSTRUCTIONS ON HOW THE PROFESSOR WANTS THE STUDENT REPLIES ANSWERED. PLEASE FOLLOW THESE DIRECTIONS. IN YOUR STUDENT REPLIES, DO NOT USE THE WORDS "I AGREE" OR "NICE JOB"; THE PROFESSOR DOES NOT WANT TO SEE THOSE WORDS IN THE STUDENT REPLIES. ALSO, BOTH STUDENT REPLIES NEED TO BE 250 OR MORE WORDS, USING TEXTBOOK AND OUTSIDE REFERENCES. REMEMBER, YOU CAN PULL REFERENCES FROM THE READING THAT I PREVIOUSLY ATTACHED FOR YOU THIS WEEK. THANK YOU, AND LET ME KNOW IF YOU NEED ANY MORE CLARIFICATION ON THIS ASSIGNMENT.

MAKE SURE YOU PUT THE NAME WITH EACH STUDENT SO I WILL KNOW WHICH ONE GOES WITH WHO.

Respond to at least one of your colleagues’ postings. Respond in one or more of the following ways:

Ask 1 probing question.

Share an insight from having read your colleague’s posting.

Offer and support an opinion.

Validate an idea with your own experience.

Make a suggestion.

Expand on your colleague’s posting.

STUDENT REPLY #1 Taylor Carrion

Danny Benson was a 15-year-old, 4-foot-9 male attending George Warshaw High School. He kept mostly to himself and was shy and quiet. He spent most of his time in the library with his head in books, and he did not really have friends except for one kid named Tony Danielson. Because he was a loner, seen by some as a nerd, and small for his age, he was bullied and harassed by multiple classmates. He tried to get help on multiple occasions, but the school was not much help; its bullying protocols were neither good nor strict. So one day he decided to plan a shooting at the school, to get revenge on the classmates and staff who did not help him in his time of need and to show them that he was not to be messed with and that he had the power. He got his revenge on a Friday afternoon: he walked into the school, somehow getting through its blind spots so that no one saw him, started shooting classmates at lunch, and then headed to the office and shot staff as well. He killed 15 people and injured 26. After trying to escape and shooting at the police, he was finally surrounded, gave up, and killed himself, believing that was the best thing he could do now that he was surrounded. Danny would be considered a mass murderer because he killed more than three people in a single episode at his school. He did this out of revenge for being bullied and harassed by classmates on multiple occasions. There were no patterns and no repeated shootings; it was one mass killing. He was a revenge-oriented mass killer, in the subgroup of school shooters. In the PDF Mass Murder (n.d.), bullying and teasing is described as most likely the main motivation behind the students' violence. He wanted to get back at others who teased and bullied him.
It was also stated, This was Danny's case where he plotted his revenge based on the teasing and bullying by his former classmates and the staff who did not help as much.

Next, Vlad Kauffer was a middle-aged Caucasian man in his late 40s, an intelligent man who worked in finance at a high-end bank. His problems began in childhood, specifically with his mother, who was not around much but exposed him to a very sexual life; she was a prostitute and would be out late at night with random strangers. Vlad had to take care of himself from an incredibly early age, which caused anger toward his mother and toward women who reminded him of her. When she was around, she was an alcoholic and very domineering. Vlad would stalk young women who were prostitutes, convince them to get into his car, and take them back to his place, where he tortured, raped, and killed them; he disposed of the bodies by cutting them into pieces and spreading the pieces across various locations. He wanted to rid the world of prostitutes like his mother. He would be considered a serial killer because of the patterns, i.e., repeated killings of victims with a specific profile. There were multiple victims at different periods of time and on multiple occasions, as opposed to one mass event. He would be considered mission-oriented and out for revenge. As stated by Fox & Levin (2005), "The motive of power and control encompasses what earlier typologies have termed the mission-oriented killer (Holmes & DeBurger, 1988), whose crimes are designed to further a cause. Through killing, he claims an attempt to rid the world of filth and evil, such as by killing prostitutes or the homeless." He wanted control of his life and wanted to rid the world of women who were prostitutes and reminded him of his neglectful, alcoholic mother.

Reference

Fox, J. A., & Levin, J. (2005). Defining multiple murder. In Extreme killing: Understanding serial and mass murder (pp. 15-25). Thousand Oaks, CA: Sage Publications, Inc. Extreme Killing: Understanding Serial and Mass Murder. Copyright 2005 by Sage Publications via the Copyright Clearance Center.

Mass Murder. (n.d.).
https://class.waldenu.edu/bbcswebdav/courses/USW1.17952.202310/Mass%20Murder.pdf

STUDENT REPLY #2 Nicole Holmes

Serial murder is the killing of two or more victims by the same offender in separate events.

Jeffrey Dahmer is an example of a serial murderer.

He started killing in 1978 at the age of 18.

Dahmer killed at least 17 people that authorities found out about.

John Wayne Gacy was another serial murderer. He was convicted of 33 counts of murder, and one of his victims was as young as 15.

Jeffrey Dahmer would lure young boys to his home, sometimes acting as if he wanted to be friends with them, only to add to his body count.

Some say he would sodomize the deceased victims.

Dahmer can be labeled a serial murderer because he killed multiple people at different times in separate events. He also fits the description of a serial murderer because of the cooling-off periods between his killings; no killings are known to have occurred on a daily basis.

Of course, profiles are not suitable in all cases, even in some murder cases (Holmes & Holmes, 1992, 2000). They are usually more efficacious in cases where the unknown perpetrator has displayed indications of psychopathology (Geberth, 2006; Holmes & Holmes, 2000). Crimes most appropriate for psychological profiling are those where discernable patterns are able to be deciphered from the crime scene or where the fantasy/motive of the perpetrator is readily apparent.

I think the Jeffrey Dahmer case is a great example of this. A documentary about him shows that Dahmer first began his unusual acts with a mannequin.

What differentiates mass murder from serial murder is also the timing and number of murders. Serial killers commit murder over long periods of time, sometimes in different locations, as Dahmer did.

His killings were based on sexual homicide and sadistic sexual assaults.

Mass murderers, by contrast, kill within a single time frame.

Serial killers differ in their motives for killing. Dahmer was a visionary in his plot for murder.

Reference

Holmes, R. and Holmes, S. (2008) Profiling Violent Crimes. 4th edn. SAGE Publications.

PROFESSOR REPLY

PLEASE ANSWER THE PROFESSOR'S QUESTION BELOW BASED ON YOUR WEEK 1 DISCUSSION, WHICH FOLLOWS

Serial Killer and Mass Murderer

A serial killer is a person who murders three or more individuals over a period exceeding a month, with a resting time between murders. The murders are separate events driven by psychological pleasure or thrill (Holmes & Holmes, 2009). Serial killers lack guilt and empathy and are egocentric. They are psychologically motivated and organized in committing murder. Serial killers wear a mask of sanity to appear charming and ordinary while hiding their actual psychopathic tendencies. For instance, Ted Bundy was an appealing serial killer who methodically planned his murders (Stone, 2019). He would fake injuries to seem harmless to victims. He committed about thirty murders between 1974 and 1978 before his capture.
Mass murderers kill many people, usually at the same time and in a single location. For instance, James Holmes opened fire in a Colorado movie theater (Allely, 2020), injuring fifty-eight people and killing twelve, which makes him a mass murderer. A psychiatry professor from Columbia argues that mass murderers tend to be dissatisfied people with few friends and poor social skills. Generally, mass murderers' motives are less apparent than those of serial killers. Professor Stone claims that males commit most mass murders and that most of them are not clinically psychotic. Rather than being sociopaths like serial killers, mass murderers are distrustful persons with acute social and behavioral syndromes. Like serial killers, mass murderers exhibit psychopathic inclinations, including being uncompassionate, cruel, and manipulative. Nevertheless, most mass murderers are loners or social nonconformists whose actions are triggered by some overpowering event.
Generally, mass murderers and serial killers demonstrate similar manipulative characteristics and a lack of empathy. The factors that distinguish the two are the number of murders and their timing. Mass murderers kill people in a single time frame and location, whereas serial killers often murder in different places and over a long period.

References

Allely, C. S. (2020). The contributory role of psychopathology and inhibitory control in the case of mass shooter James Holmes. Aggression and Violent Behavior, 51, 101382.
Holmes, R. M., & Holmes, S. T. (2009). Profiling violent crimes: An investigative tool (4th ed.). Thousand Oaks, CA: Sage Publications, Inc.
Stone, M. H. (2019). The place of psychopathy along the spectrum of negative personality types. In Psychoanalysts, psychologists and psychiatrists discuss psychopathy and human evil (pp. 82-105). Routledge.

PROFESSOR REPLY QUESTION

GO BACK TO WK1 ALL ATTACHED READING TO ANSWER BACK TO THIS QUESTION

In Chapter 1 of your textbook, the author lists several crimes that are most suitable for profiling. What are some of these crimes?

Week 1 Test for Understanding

This 10-question, objective Test for Understanding will assess how well you understand and can apply the information in this week’s Learning Resources.

To prepare for the Test for Understanding:

Review the assigned Learning Resources.
About the Test for Understanding:

PLEASE HIGHLIGHT THE CORRECT ANSWER IN RED. IF YOU GO BACK TO ALL THE READINGS I HAVE POSTED FOR WEEK 1 IN THE LAST 2 ASSIGNMENTS, YOU SHOULD BE ABLE TO FIND THESE ANSWERS.

QUESTION 1

As a result of a sexual fantasy, a man kills a series of women over a period of time to demonstrate his control over the victims. What is most likely the motivating reason for this murder?

Terror

Loyalty

Revenge

Power

QUESTION 2

The biggest difference between serial murderers and mass murderers is:

The number of victims

The lapse in time in between killings

The motivation for the killings

The selection of the victims

QUESTION 3

Suppose a criminal profiler is assisting law enforcement in the interrogation of potential suspects. This would most likely be an example of which of the three major goals of criminal profilers?

To provide the criminal justice system with a social and psychological assessment of the offender

To provide the criminal justice system with a psychological evaluation of the belongings found in the possession of the offender

To provide interviewing suggestions and strategies

To provide the criminal justice system with a hypothesis about where the potential serial murderer lives

QUESTION 4

A man kills seven people at his place of employment and then takes his own life. He would be considered a ___________.

mass murderer

serial murderer

spree murderer

suicide murderer

QUESTION 5

Typologies of mass and serial murderers are useful in constructing a profile of a murderer. In general, one of the most important characteristics of a crime scene in determining the type of murderer is __________.

the time of the murder

the victims’ characteristics

the weapons used in the killing

the location of the bodies

QUESTION 6

The serial killer Dennis Rader, known as the BTK Strangler, was unique among serial killers because:

He killed mostly women.

He killed some of the victims in their homes.

He killed victims near his place of residence.

Some of his killings were separated by long periods of time.

QUESTION 7

Inductive reasoning and deductive reasoning are the two major ways criminal profilers construct profiles of potential suspects. With ______________ logic, a criminal profiler conducts a thorough analysis of a crime scene and then based on the analysis, constructs an image of the unknown murderer.

inductive

deductive

inductive and deductive

deducible

QUESTION 8

Proponents of criminal profiling recognize that profiling is part art, but they also recognize that it is grounded in science. Which of the following statements reveals the science aspect of criminal profiling?

Criminal profilers use their intuition to create profiles.

Criminal profilers rely on their hunches to create profiles.

Criminal profilers rely on empirical research from criminology, sociology, and psychiatry to create profiles.

Criminal profilers rely on guesswork to create profiles.

QUESTION 9

Advocates of criminal profiling recognize that criminal profiling is appropriate for which of the following types of crimes?

Sexual homicide

Child molestation

Armed robbery

Burglary

QUESTION 10

Using typologies is at times difficult because not all serial and mass murderers fall neatly into one typology or another. Assume you have to classify the Virginia Tech killer into one particular type of mass murderer. All that you know about the killer is that he had a stockpile of weapons at his disposal. What type of mass murderer is the Virginia Tech killer?

The family annihilator

The disgruntled employee

The disciple

The pseudocommando

10/15/22, 5:24 PM SafeAssign Originality Report

https://class.waldenu.edu/webapps/mdb-sa-BBLEARN/originalityReportPrint?course_id=_17013007_1&paperId=5938498949&&attemptId=77dd46c7-2c5d-b92a-e149-d3f3383ccced&course_id=_17013007_1 1/8

USW1.17952.202310 – CRJS-3010-1-PROF SERIAL AND MASS MURD-2022-FALL-QTR-TERM-WKS-7-THRU-12-(10/10/2022-11/20/2022)-PT5

Assignment – Week 1
Jennifer Green
on Sat, Oct 15 2022, 6:14 PM
100% highest match
Submission ID: 77dd46c7-2c5d-b92a-e149-d3f3383ccced

Attachments (1)

WK1ASSGN GREEN J.docx

Criminal Profiling

Jennifer Green

Walden University

CRJS 3010 – 1

Brent Paterline

October 15, 2022

Historical Influences in Profiling

Criminal profiling is crucial when looking into crime scenes, and it has evolved steadily over time. Criminal profiling in the modern era is more science than art: most modern profiling techniques are based on science, and they differ significantly from the ways in which people were profiled in the past. However, the evolution of present-day profiling methods was shaped by past profiling practices. One of the current profiling techniques, for instance, is the gathering of medical evidence, in which a medical expert is hired to identify the offender's psychological and behavioral traits (Francese, 2019). Traditional profiling methods included behavioral profiling, which influenced the practice of acquiring clinical evidence for profiling. Furthermore, the current methods of profiling are guided by the criteria of historical profiling. Consider the idea that criminals may possess certain physical traits. These traits are used to categorize criminals into different groups (Chifflet, 2015), and a criminal profiler can use them to distinguish between mass murderers and serial killers. To identify criminal propensities, historical profiling included assessments of personality and mental capacity. With the exception of applying scientific techniques, the assessment is carried out in accordance with existing profiling standards.

WK1ASSGN GREEN J.docx
Word Count: 664
Attachment ID: 5938498949

100%

http://safeassign.blackboard.com/

Roles and Responsibilities of Profilers

Criminal profilers play a critical role in criminal investigations. They help law enforcement professionals apprehend offenders, and the criminal profiles they develop make arresting criminals easier. In doing this, criminal profilers perform many roles and responsibilities. First, they review the evidence at crime scenes to gather details about the crime, which also helps identify the offenders' criminal behavior (Francese, 2019). Second, they find as much information as they can about suspects and use it to develop a criminal profile. Third, criminal profilers study the crime scene to determine the behavior patterns of the offenders (Holmes & Holmes, 2009). They then write reports, compile data, and draw conclusions that they provide in court as testimony. Last but not least, these professionals advise law enforcement professionals on the techniques they should use to pursue criminals.

Detection and Apprehension of Serial and Mass Murderers

Through their roles, goals, and responsibilities, criminal profilers help in the detection and apprehension of criminals. How they do this, however, can aid or hinder the apprehension of serial killers and mass murderers. If they develop vague criminal profiles for serial killers and mass murderers, apprehension will be hard: with such a profile, law enforcement officers cannot easily narrow down the list of suspects to identify the one who committed the mass murder or series of murders (Holmes & Holmes, 2009). Moreover, a profile built on scant information can slow the apprehension of serial killers and mass murderers. For example, if the profile uses only psychological descriptions, without physical descriptions such as height, it may be difficult for the police to apprehend the suspect. Developing a detailed profile with specific information, by contrast, will aid the apprehension of these criminals. Such a profile consists of behavioral, psychological, and physical information about the criminals (Francese, 2019). Profilers who do in-depth research on serial killers and mass murderers will be able to give this information in more detail. The more information is available, the higher the chances of detecting and apprehending the suspects.

References

Chifflet, P. (2015). Questioning the validity of criminal profiling: An evidence-based approach. Australian & New Zealand Journal of Criminology, 48(2), 238-255. https://doi.org/10.1177/0004865814530732

Holmes, R. M., & Holmes, S. T. (2009). Profiling violent crimes: An investigative tool. Thousand Oaks, CA: Sage Publications, Inc.

Francese, S. (2019). Criminal profiling through MALDI MS based technologies – breaking barriers

  
