A Pricing Scheme for the Task "Take Photos to Make Money"

In this paper, we first use multiple stepwise regression to fit the price function and analyze the reasons for unfinished tasks from several angles. Then, using the idea of the "optimal neighbor", a matching model of tasks and members is established to optimize the original pricing model. Next, starting from the "small world" model, the actual packaging situation is simulated, and a price correction function is defined to reduce the impact of task distribution density on completion, which further optimizes the pricing scheme. Finally, the new project is priced with an optimal package pricing formula, and the implementation effect is judged from the two perspectives of task completion and simulation results.

tasks, or in the reasons for unfinished tasks, the location of tasks always has a direct or indirect relationship with the pricing. To design a better pricing scheme, the new scheme is therefore based on the location relationship between tasks and members.
Finally, for the case of a concentrated task distribution, the pricing model in which tasks are packaged and released together is considered and evaluated. The second part shows that users pursue gain per unit distance when completing tasks. When tasks are concentrated, the benefit per unit distance is higher even if the benefit of a single task is lower. In the same way, when tasks are dispersed, the benefit per unit distance is lower even if a single task pays well. To further improve the model, the pricing should therefore be adjusted for the task distribution.

The Task and Member Distribution are Clustered
There are many data processing methods, such as interval convergence processing, standardization processing and clustering processing, and an appropriate method can be selected according to the specific data. This paper adopts the Z-score standardization method. The steps are as follows. For the index values $x_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$) of $m$ indexes of $n$ evaluated objects, the new standardized index value is

$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_j}{s_j},$$

where $\bar{x}_j$ is the average value of the $j$th index over the $n$ evaluated objects, $s_j$ is the standard deviation of the $j$th index over the $n$ evaluated objects, and $x_{ij}$ is the unstandardized index value.

The k-means method is used to cluster the task and member distributions (Tingting Li, 2015). First, $K$ objects are chosen at random from the distribution coordinates as the initial cluster centers. Then, in a loop, the distance from each object to each cluster center is calculated, and each object is reassigned to the cluster whose center is nearest. Finally, the mean of each cluster is recalculated. The loop ends when the cluster centers no longer change, which minimizes the following criterion (Zhuo jinwu, 2014):

$$E = \sum_{i=1}^{K} \sum_{x \in C_i} \lVert x - m_i \rVert^2,$$

where $m_i$ is the mean of the $i$th cluster $C_i$ and $x$ is a variable in the $i$th cluster, so the distance between the variables in a cluster and the cluster mean is minimized. The clustering results are as follows:
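The two steps above, Z-score standardization and k-means clustering, can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually run for the paper: the coordinates are hypothetical, the standard deviation is taken over the whole population (an assumption), and the function names `zscore`, `kmeans` and `sse` are introduced here only for the example.

```python
import math
import random

def zscore(values):
    """Standardize a list of numbers: subtract the mean, divide by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population form (assumption)
    return [(v - mean) / std for v in values]

def kmeans(points, k, seed=0, max_iter=100):
    """Plain k-means on 2-D points; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center becomes the mean of its cluster.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:  # centers no longer change
            break
        centers = new_centers
    return centers, labels

def sse(points, centers, labels):
    """The criterion E: sum of squared distances to the assigned cluster mean."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical (latitude, longitude) pairs, not taken from the annex.
points = [(22.60, 114.05), (22.65, 114.10), (22.62, 114.08),
          (23.10, 113.30), (23.15, 113.28), (23.12, 113.33)]
centers, labels = kmeans(points, k=2)
standardized = zscore([65.0, 70.0, 75.0, 85.0])
```

Because the initial centers are random, different seeds can give different local minima of $E$ on real data; running several seeds and keeping the result with the smallest `sse` is a common safeguard.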

Spatial Distribution of Task Pricing
With the help of Surfer 8.0 software, the spatial distribution of the task prices in annex 1 is plotted to obtain a spatial distribution diagram of task pricing. In the resulting figure, most areas are relatively light in color, indicating that prices between 65 and 75 account for the majority of task pricing. The darker areas with higher pricing are mostly distributed in patches, the task pricing near the darker areas is also relatively high, and contiguous areas generally have the same or similar prices.
In practice, the location of a task determines how much time and energy platform members need to spend to reach and complete it. Tasks that require more effort are often located in remote places, so the pricing of such tasks should be raised to motivate members to complete them.
To sum up, there is a close relationship between task pricing and task location. Therefore, the longitude and latitude of each task point are used as variables influencing the pricing.

Model of Task Pricing
According to the above analysis, the price of the task is closely related to the location of the task, so the longitude and latitude of the task point can be taken as the independent variable affecting the price.
In addition, the price of a task far from the member locations should be increased accordingly. When the price increase of a task does not match the distance increase, the task may not be completed. To motivate members to complete tasks, the price should rise as the distance to the task increases. Based on this, the p-d (price-distance) matching factor is defined as the third variable influencing the price. It measures how well the change in task price matches the change in task distance.
Only when the change degree of the two is consistent can the task be accepted and completed by members.
Based on k-means clustering analysis, the tasks in annex 1 are classified into four categories according to the coordinates (figure 2-1), and the average price of each type of task is calculated.
The ratio of the price of each specific task and its cluster average price to the price and the average price of all tasks is defined as the degree of price change, written here as $\Delta P_i$. The ratio of the distance from the task location to the member cluster center to the sum of the distances from all tasks to the member center is defined as the degree of distance change, written here as $\Delta D_i$. Because the locations of the given data points vary very little, the region can be treated almost as a plane, so the Euclidean distance is used to measure the distance between two points. The ratio between the degree of price change and the degree of distance change is defined as the p-d interaction factor:

$$PD_i = \frac{\Delta P_i}{\Delta D_i} \tag{1}$$

The degree of price change and the degree of distance change are standardized with the Z-score method to eliminate the influence of dimension, and the processed values are substituted into (1) to calculate the matching value of each task point. Based on the above analysis, a multiple linear regression model is established with task price as the dependent variable and latitude, longitude and the matching factor as independent variables:

$$P = \beta_0 + \beta_1 La + \beta_2 Lo + \beta_3 PD,$$

where $La$ is the latitude of a task point, $Lo$ is the longitude of the task point, $PD$ is the p-d interaction factor, and $\beta_0, \ldots, \beta_3$ are the regression coefficients.
There are some outliers among the p-d interaction factors. Although such values may be reasonable in practice, they degrade the fit of the model, so these singular data are removed when fitting the equation (annex 2). Using the first 90% of the data (742 records), the multiple regression is solved with the regression toolbox in MATLAB, which gives the optimized regression equation with the coefficient values 199.5307, 3.41324, 1.6699 and 0.08189.
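The fitting procedure described above (fit on the first 90% of the records, check on the remaining 10%) can be sketched as follows. This is an illustrative Python stand-in for the MATLAB regression toolbox, not the authors' script: the data are synthetic, the generating coefficients 100, 3, -1 and 0.5 are hypothetical and unrelated to the fitted values reported in the text, and the helper names `solve`, `fit_linear` and `predict` are introduced only for this example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_linear(rows, y):
    """Least-squares fit y ~ b0 + b1*x1 + ...; predictors are centered for numerical stability."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[r[j] - means[j] for j in range(p)] for r in rows]
    yc = [v - y_mean for v in y]
    xtx = [[sum(r[i] * r[j] for r in xc) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(xc, yc)) for i in range(p)]
    slopes = solve(xtx, xty)
    intercept = y_mean - sum(s * mu for s, mu in zip(slopes, means))
    return [intercept] + slopes

def predict(coef, row):
    """Evaluate the fitted equation for one (La, Lo, PD) row."""
    return coef[0] + sum(c * x for c, x in zip(coef[1:], row))

# Synthetic records (La, Lo, PD) with hypothetical generating coefficients.
rng = random.Random(1)
rows, prices = [], []
for _ in range(100):
    la, lo, pd = rng.uniform(22.5, 23.3), rng.uniform(113.0, 114.3), rng.uniform(-2.0, 2.0)
    rows.append([la, lo, pd])
    prices.append(100.0 + 3.0 * la - 1.0 * lo + 0.5 * pd + rng.gauss(0.0, 0.2))

split = len(rows) * 9 // 10  # first 90% for fitting, last 10% for testing
coef = fit_linear(rows[:split], prices[:split])
errors = [abs(predict(coef, r) - y) / y for r, y in zip(rows[split:], prices[split:])]
mean_error = sum(errors) / len(errors)  # average relative prediction error
```

Centering the predictors before solving the normal equations is a design choice made here because latitude and longitude vary within a narrow band around large values, which would otherwise make the system poorly conditioned.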

Model of the Factors Influencing the Complexity of the Task
Besides the distance of the task and the matching degree of the price, the difficulty of completing the task also affects the setting of the price. The harder the task is, the higher the price should be. Since a task is more difficult to complete the farther it is from the member locations, this paper measures the difficulty of a task by the distance between the task and the member center, and uses this difficulty as an influence factor in pricing.

$$\text{Difficulty level of task } i = \frac{\text{Distance from cluster center for task } i}{\text{Average distance from all points to cluster center}}$$

In the formula, the numerator is the distance from the cluster center for task $i$.
Taking the impact of task difficulty on pricing into account, the pricing model can be obtained as equation (2). The effect of equation (2) is tested with the remaining 10% of the data as above, and the average prediction error is 6.10% (annex 2). Compared with the model without the influence factor, the error of the model is reduced to a certain degree, by 6.94%. Therefore, the pricing rule performs better after adding the influence factor of task difficulty.
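The difficulty factor described above, the distance of a task from the cluster center divided by the average distance of all points from that center, is simple to compute. The sketch below assumes plane Euclidean distances, as the text does for this small region; the function name `difficulty_levels` and the sample points are hypothetical.

```python
import math

def difficulty_levels(task_points, center):
    """Difficulty of each task: its distance to the center over the mean distance."""
    distances = [math.dist(p, center) for p in task_points]
    mean_distance = sum(distances) / len(distances)
    return [d / mean_distance for d in distances]

# Hypothetical plane coordinates: one task near the center, one farther away.
levels = difficulty_levels([(0.0, 1.0), (0.0, 3.0)], center=(0.0, 0.0))
```

A value above 1 marks a task that is farther than average from the member center, and so harder by this measure.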

The Cause of the Unfinished Task
According to annex I, the project has a total of 834 tasks, of which 521 have been completed, a completion rate of 62.47%, so a considerable number of tasks remain unfinished. According to the above analysis, the location of a task has the greatest impact on its completion status. Secondly, the price level determines whether a task is attractive. The price-distance matching degree also affects the members' willingness to accept a task to some extent. Finally, the limit on the number of tasks a member may accept and the credibility of the member also affect completion. Therefore, the reasons for unfinished tasks are analyzed from the following four perspectives:
(1) Geographical location. The clustering results are drawn according to the distribution of completed and unfinished tasks. The completion degree of different regions shows a strong relationship between task completion and geographical location. At the same time, the regression equation shows that the coefficients of longitude and latitude are relatively large, so pricing is closely related to longitude and latitude, that is, to location. Since pricing directly determines whether members are willing to complete a task, geographic location is closely related to whether a task can be completed.

Figure 2.4. Task completion map
The red dots represent unfinished tasks, the green dots represent completed tasks, and the blue dots represent members' positions.
By common sense, the closer a task is to a densely populated area, the easier it is to complete. The tasks given in the problem are located roughly in central Guangdong province. Observing the distribution of completed tasks shows several regions with a higher completion degree, in the longitude ranges 113.3°~113.6°, 113.6°~113.8° and 113.9°~114.2°. Geographically, these three longitude ranges correspond to three cities, namely Guangzhou, Dongguan and Shenzhen. Therefore, urban areas have a higher completion degree than other regions.
From a practical perspective, densely populated areas tend to be economically developed regions (Lina Wang & Aoran Xu, 2015), (Xiaorong Jiang & Shenglan Qang, 2017), where it is more convenient to carry out surveys and to reach the survey areas, so the number of people accepting tasks and the completion degree of tasks are relatively high. In non-urban areas the completion degree is relatively low because of inconveniences in various aspects of the survey process. Therefore, an unfinished task may be caused by the unfavorable geographical location of the task.
(2) Price. Price is the most important factor in determining whether a task is accepted and completed. The tasks are divided into two groups according to completion status, and the prices of the two groups are tested for a difference. Since the two groups are numerical and independent, the Kruskal-Wallis (K-W) test is adopted. The test gives a P value of less than 0.005, so the null hypothesis is rejected and there is a significant difference between the prices of the two groups. Therefore, an unfinished task may be caused by unreasonable price setting.
(3) Price-distance match. According to the above analysis, when a task is located far from the urban area, its difficulty increases accordingly. The platform then needs to raise the price of the task before anyone will accept it. When the distance of a task increases and the price does not increase accordingly, the task is difficult to complete. Therefore, a mismatch between the price and the location of a task also makes it difficult to complete. This factor is reflected in the p-d (price-distance) matching factor above.
(4) Membership. As can be seen from the attachment given in the problem, members with high reputation are allowed to start selecting tasks earlier and have a larger task quota. Theoretically, people with high credibility should accept more tasks, but in fact the platform sets a scheduled task limit for everyone. As a result, some tasks cannot reach people with high credibility and may not be completed.
According to annex 2, there are a large number of members with low reputation. Although members with high reputation can choose tasks first, the scheduled task limit and the difference in numbers mean that many tasks are still selected by members with low reputation, which results in tasks not being completed (Huang limei, 2016).
The number and distribution of members also lead to a low degree of task completion. In large cities such as Guangzhou and Shenzhen, the number of members far exceeds the number of tasks, so members compete for tasks and the capacity of low-reputation members is wasted. In smaller cities such as Dongguan, the number and distribution of members are well matched with the tasks, and there is no waste or competition, which may be the reason why the task completion level in Dongguan is much higher than in Guangzhou and Shenzhen.
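The K-W test used in item (2) above can be illustrated with a small self-contained implementation for two groups. The prices below are hypothetical and are not the annex data; the helper names are introduced for this sketch. For two groups the H statistic is compared with a chi-square distribution with one degree of freedom, whose upper tail equals `erfc(sqrt(H / 2))`.

```python
import math

def rank_with_ties(values):
    """Return average ranks (1-based) and the tie-correction sum of t^3 - t."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sum = 0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        t = j - i + 1
        tie_sum += t ** 3 - t
        i = j + 1
    return ranks, tie_sum

def kruskal_two_groups(a, b):
    """Kruskal-Wallis H statistic and p-value for two independent samples."""
    values = list(a) + list(b)
    n = len(values)
    ranks, tie_sum = rank_with_ties(values)
    r_a = sum(ranks[:len(a)])
    r_b = sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (r_a ** 2 / len(a) + r_b ** 2 / len(b)) - 3 * (n + 1)
    h /= 1 - tie_sum / (n ** 3 - n)  # correction for tied values
    p = math.erfc(math.sqrt(h / 2))  # chi-square upper tail, 1 degree of freedom
    return h, p

# Hypothetical prices of completed and unfinished tasks.
completed = [70.0, 72.0, 72.5, 75.0, 80.0, 85.0]
unfinished = [65.0, 65.5, 66.0, 66.5, 67.0, 68.0]
h_stat, p_value = kruskal_two_groups(completed, unfinished)
```

With real data containing many repeated prices, the tie correction matters; the chi-square approximation is also only reliable for reasonably large groups.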

Improvement Direction of the New Scheme
First, the regression equation in question (1) shows that the pricing and the location of a task are directly related, and the p-d interaction factor also reflects the relationship between location and price. The difficulty influence factor likewise measures the difficulty of a task from its position. Therefore, it can be judged that the main reason a task cannot be completed is that it is located far away and is difficult to complete. Although the price has been raised to a certain extent, it is still not enough to motivate members to complete the task.
Therefore, from the perspective of task location, a circle is drawn with the task coordinates as its center. According to the principle of nearby assignment, the minimum radius at which each task can reasonably be allocated is found. The longer the radius, the more difficult the task is to complete, so the price should be raised to make members accept it. Each task is priced according to the length of its radius.

Establishment of Dynamic Partition Pricing Scheme
Build a matching model centered on the task location, and match each task with the member nearest to it according to the principle of "optimal proximity". Under this principle, a task is assigned to the member closest to the task point within a circle of a given radius.
Since the data obtained from the annex are the longitude and latitude coordinates of the task points, while the radius of the circle is a plane distance, longitude and latitude must be converted to distance. According to the literature, the distance between any two points on the Earth's surface can be calculated as follows. Let the longitude and latitude of point A be (LonA, LatA) and those of point B be (LonB, LatB). Taking the 0-degree meridian as the datum, east longitude is taken as positive and west longitude as negative, while north latitude is replaced by 90 - latitude and south latitude by 90 + latitude. The processed coordinates are (MLonA, MLatA) and (MLonB, MLatB). From spherical trigonometry, the distance between the two points is

$$C = \sin(MLatA)\sin(MLatB)\cos(MLonA - MLonB) + \cos(MLatA)\cos(MLatB),$$

$$D = R \arccos(C),$$

where the angles are expressed in radians, $R$ is the mean radius of the Earth (about 6371 km), and $D$ is in kilometers.
The distance between two points used above can accordingly be replaced by the distance formula $D(A, B)$. Thus, the radius of the neighborhood takes a set of values.
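The conversion from longitude and latitude to distance described above can be written as a short Python function. This is a sketch under two stated assumptions: the Earth is treated as a sphere with mean radius 6371 km, and inputs are signed degrees (north and east positive), so that 90 minus the signed latitude covers both the "90 - latitude" and "90 + latitude" cases. The function name `great_circle_km` is introduced for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)

def great_circle_km(lon_a, lat_a, lon_b, lat_b):
    """Distance in km between two points given in signed degrees."""
    mlat_a = math.radians(90.0 - lat_a)  # colatitude of A
    mlat_b = math.radians(90.0 - lat_b)  # colatitude of B
    dlon = math.radians(lon_a - lon_b)
    c = (math.sin(mlat_a) * math.sin(mlat_b) * math.cos(dlon)
         + math.cos(mlat_a) * math.cos(mlat_b))
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    return EARTH_RADIUS_KM * math.acos(c)

# One degree of latitude along the same meridian, near the study area.
one_degree = great_circle_km(113.3, 23.0, 113.3, 24.0)
```

For points only a few hundred meters apart, the arccosine form loses some precision; the haversine form is a common alternative in that regime, though it is not the formula used in the text.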
Since tasks are assigned according to the proportion of the quota, the data structure of a member is represented as

$$V(j) = (Lo(j), La(j), PM(j)), \quad j = 1, 2, \ldots, 1877,$$

where $V(j)$ is the data structure of the $j$th member, $Lo(j)$ is the longitude of the $j$th member, $La(j)$ is the latitude of the $j$th member, and $PM(j)$ is the percentage of the $j$th member's scheduled task limit in the total task limit.
Corresponding to the member's data structure, the data structure of a task is represented as

$$T(i) = (Lo(i), La(i), PP(i)), \quad i = 1, 2, \ldots, 835,$$

where $T(i)$ is the data structure of task $i$ and $PP(i)$ is the proportion of the price of task $i$ in the total task price.
It can be predicted that the larger the radius, the more difficult it is to find a matching member, so a higher price should be set for the task. According to the above formula for spherical distance from longitude and latitude, the shortest distance between any two task points in annex I is 0.17 km and the longest is 13.50 km. Therefore, the shortest and longest distances are used as the extreme values of the radius, and 40 radii $r_k$ ($k = 1, 2, \ldots, 40$) are taken at intervals of 0.33 km to test, one by one, the minimum radius at which each task point can be matched to a member. The matching degree $Mat(i, k)$ of the $i$th task at radius $r_k$ is then defined. Because tasks are in fact assigned according to the proportion of the scheduled quota, task $i$ can be matched with a member $j$ inside the circle when the proportion $PP(i)$ of the price of task $i$ in the price of all tasks is greater than or equal to the proportion $PM(j)$ of the $j$th member's scheduled task limit in the total limit. In that case the matching degree is defined as 1; otherwise it is defined as 0:

$$Mat(i, k) = \begin{cases} 1, & \text{there is a member } j \text{ with } D(i, j) \le r_k \text{ and } PP(i) \ge PM(j), \\ 0, & \text{otherwise,} \end{cases}$$

where $Mat(i, k)$ is the matching degree of the $i$th task when the radius is $r_k$. According to the principle of "optimal proximity", the goal is to set the price of a task point according to its minimum radius. Therefore, a programming model is established that minimizes the radius index subject to a successful match, so as to obtain the minimum radius of each task point. The planning model is as follows:

$$\min k_i \quad \text{s.t.} \quad Mat(i, k_i) = 1, \quad k_i \in \{1, 2, \ldots, 40\},$$

where $k_i$ is the sequence number of the minimum radius taken for task point $i$, and $Mat(i, k_i)$ is the matching degree of the $i$th task point at the $k_i$th radius.
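The search for the minimum matching radius can be sketched as a loop over the candidate radii in increasing order. The example below is only an illustration of the idea: the dictionary keys `pp` and `pm`, the plane coordinates in kilometers, and the assumption that the radius grid starts at 0.17 km are all choices made for this sketch, and the real model would use the spherical distance between task and member.

```python
import math

def min_matching_radius(task, members, radii, distance):
    """Return (k, radius) for the smallest radius at which the task matches a member.

    A member matches when it lies within the radius and the task's price share
    `pp` is at least the member's quota share `pm`. Returns None if no radius works.
    """
    for k, r in enumerate(radii, start=1):
        for m in members:
            if distance(task, m) <= r and task["pp"] >= m["pm"]:
                return k, r
    return None

def plane_distance(a, b):
    """Plane distance in km between two records with x and y fields (stand-in metric)."""
    return math.hypot(a["x"] - b["x"], a["y"] - b["y"])

# 40 candidate radii at 0.33 km intervals (starting value is an assumption).
radii = [0.17 + 0.33 * k for k in range(40)]

task = {"x": 0.0, "y": 0.0, "pp": 0.002}
members = [
    {"x": 0.3, "y": 0.0, "pm": 0.005},  # close, but quota share exceeds the price share
    {"x": 1.0, "y": 0.0, "pm": 0.001},  # farther away, and it matches
]
result = min_matching_radius(task, members, radii, plane_distance)
```

Here the nearest member fails the share condition, so the task is only matched at the fourth radius, which is the kind of case that pushes its price upward under the scheme.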
With MATLAB, the minimum radius corresponding to each task point in annex 1 can be obtained. The minimum distance, 0.17 km, should correspond to the minimum price of 65 yuan, and the maximum distance, 13.50 km, to the maximum price of 85 yuan. Because the price rises with distance, distance and price are put in one-to-one correspondence, which gives the most suitable price for each task point in annex I. Limited by space, only part of the result is shown below.
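One simple way to realize the one-to-one correspondence between radius and price is linear interpolation between the two end points given above. The text does not state the exact mapping, so the linear form is an assumption of this sketch, and the function name `price_from_radius` is hypothetical.

```python
def price_from_radius(radius_km, r_min=0.17, r_max=13.50, p_min=65.0, p_max=85.0):
    """Map a minimum matching radius to a price by linear interpolation (assumed form)."""
    radius_km = min(max(radius_km, r_min), r_max)  # clamp to the observed range
    return p_min + (p_max - p_min) * (radius_km - r_min) / (r_max - r_min)

# A task whose minimum radius lies halfway between the extremes.
mid_price = price_from_radius(6.835)
```

Under this assumption each additional kilometer of radius adds about 1.5 yuan, since the 20 yuan price span is spread over 13.33 km.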

Comparison of Pricing Schemes
The pricing of each task under the new scheme is calculated above. The task completion degree under the new scheme should now be compared with that under the original pricing rules, to analyze whether the new scheme effectively improves the situation in which tasks cannot be completed. Surfer 8.0 is used to draw the price distribution diagram under the new pricing scheme. Compared with the original price distribution diagram (figure 2-3), the dark areas are clearly larger, so task prices under the new scheme are higher than under the old one.
Although prices under the new scheme have increased, some members will still consider the price too low and will not finish the task. Therefore, it is necessary to establish a price discrimination value. When the task price is higher than the discrimination value, the task can be completed; otherwise, it cannot. This paper measures task price by the gain per unit distance.
The discrimination value can be interpreted as the minimum gain per unit distance for which members are willing to complete a task. For completed tasks, this value is the ratio of gain to distance for the task with the maximum distance; for unfinished tasks, the critical value is the ratio of gain to distance for the task with the minimum distance.
If this interpretation holds, the discrimination values obtained from completed and unfinished tasks should be similar. After calculation, the discrimination value is 6.3 yuan/km. The formula is as follows:

$$CP(i) = \begin{cases} 1, & P_r > P_{rm}, \\ 0, & P_r \le P_{rm}, \end{cases}$$

where $CP(i)$ is the completion status of task $i$, $P_r$ is the actual unit distance gain, and $P_{rm}$ is the discrimination value of the distance gain. When the actual unit distance gain is higher than the discrimination value, the task can be completed, which is denoted by "1"; otherwise, the task cannot be completed, which is denoted by "0". This standard is used to compare the price of each task under the new pricing scheme with the discrimination value and so obtain the degree of completion. Of the 835 tasks, 565 are judged completed, a completion rate of 67.66%. Under the original pricing scheme, there were 835 tasks, of which 521 were completed, with a completion rate of 62.47%. Compared with the original pricing scheme, the completion rate of the new scheme rises by 5.19 percentage points, or about 8% in relative terms, indicating that the new pricing scheme is effective.
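The completion rule above reduces to comparing each task's gain per unit distance with the discrimination value of 6.3 yuan/km. The sketch below uses hypothetical (price, distance) pairs; which distance enters the ratio is not fully specified in the text, so treating it as the distance a member travels to the task is an assumption.

```python
def completion_status(price, distance_km, threshold=6.3):
    """CP(i): 1 if the gain per km exceeds the discrimination value, else 0."""
    return 1 if price / distance_km > threshold else 0

def completion_rate(tasks, threshold=6.3):
    """Share of (price, distance_km) tasks judged completed."""
    done = sum(completion_status(p, d, threshold) for p, d in tasks)
    return done / len(tasks)

# Hypothetical tasks: (price in yuan, distance in km).
tasks = [(70.0, 5.0), (65.0, 12.0), (80.0, 10.0), (75.0, 13.0)]
rate = completion_rate(tasks)
```

In this sample the tasks paying 14 and 8 yuan/km pass, while those paying about 5.4 and 5.8 yuan/km do not.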

Task Packaging Model
The above analysis shows that whether a member completes a task depends largely on the income per unit distance. A concentrated task distribution lets members complete more tasks in one area, which raises the benefit per unit distance. A dispersed task distribution requires members to travel to different regions, which lowers the income per unit distance, so users scramble to choose tasks that are located close together. To further optimize the model, the prices of densely and sparsely distributed tasks need to be adjusted so as to minimize the impact of location distribution on task completion.

Task Packaging Model Based on "Small World"
The observation data show that the amount of data given in the annex is large, so it is difficult to determine the packaging model directly from a density analysis of the original data. After observation and analysis, the distribution of tasks and members is found to be relatively reasonable in the central area with latitude 23°~23.2° and longitude 113.2°~113.4°, so this article selects that area as a "small world". The small world retains the realistic factors, but the model is relatively simple and easy to analyze, so realistic task packaging can be simulated by analyzing the small-world model. A grid divides the small world into 5*5 regions. The results are as follows:
Figure 4.1. The regional division diagram of the "small world" model
Tasks within one grid cell are relatively concentrated and can be packaged, so the tasks in a cell are treated as one package. When not packaged, each task is independent, with the data structure {longitude, latitude, task price}. After the tasks in a cell are packaged, the package becomes a set of task points, and its data structure is defined as {longitude and latitude, package price, number of tasks in the package}, where the longitude and latitude of the package are those of the center of its grid cell (see annex iv for the specific values of each package data structure in the small world). In this way, each package is equivalent to a task point, so that the pricing model for combining tasks and releasing them together can be considered.
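The grid packaging described above can be sketched as follows. The bounds and the 5*5 grid come from the text; the sample tasks, the dictionary field names and the function name `package_tasks` are hypothetical, and the package price here is the plain sum of task prices before any correction.

```python
def package_tasks(tasks, lat_range=(23.0, 23.2), lon_range=(113.2, 113.4), n=5):
    """Group (lat, lon, price) tasks into an n x n grid; one package per non-empty cell."""
    lat_step = (lat_range[1] - lat_range[0]) / n
    lon_step = (lon_range[1] - lon_range[0]) / n
    cells = {}
    for lat, lon, price in tasks:
        inside = (lat_range[0] <= lat <= lat_range[1]
                  and lon_range[0] <= lon <= lon_range[1])
        if not inside:
            continue  # outside the small world
        row = min(int((lat - lat_range[0]) / lat_step), n - 1)
        col = min(int((lon - lon_range[0]) / lon_step), n - 1)
        cells.setdefault((row, col), []).append(price)
    packages = []
    for (row, col), prices in sorted(cells.items()):
        packages.append({
            "lat": lat_range[0] + (row + 0.5) * lat_step,  # cell center
            "lon": lon_range[0] + (col + 0.5) * lon_step,
            "price": sum(prices),                          # uncorrected package price
            "count": len(prices),                          # number of tasks in the package
        })
    return packages

# Hypothetical tasks: two in the same cell, one in the far corner, one outside.
sample = [(23.01, 113.21, 70.0), (23.02, 113.22, 65.0),
          (23.19, 113.39, 80.0), (22.50, 113.00, 75.0)]
packages = package_tasks(sample)
```

Each returned record plays the role of a single task point in the later pricing steps.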

The Creation of a Price Correction Function
According to the above analysis, packages with a relatively concentrated task distribution attract users, while users' willingness to choose packages with relatively dispersed tasks is not strong. As a result, the completion degree of dispersed tasks becomes low after packaging. To reduce the effect of task distribution density on the packaging results, the price must be adjusted: packages with different task densities are priced differently, so that as many packages as possible are chosen and completed by members.
According to the problem statement, in principle, the more creditworthy a member is, the earlier the member may start selecting tasks and the higher the priority. High-priority members give priority to tasks with high income per unit distance. Because they select later, members with low priority are often unable to select tasks with high income. Over time, this leads to high-priority members gaining even higher priority, while low-priority members cannot receive tasks.
Based on the above considerations, this paper defines a price correction factor to adjust the price.
Each grid cell represents a unit area. The more tasks within the unit area, the denser the task distribution within the package; the fewer tasks within the unit area, the more dispersed the task distribution within the package. Under normal circumstances, the price of a package is the sum of the prices of the tasks within it. To reduce the effect of task distribution density, the price of a package with densely distributed tasks should be reduced to lower members' willingness to choose it. For packages with dispersed tasks, the price should be increased to encourage more members to choose them.
Expressed mathematically, this idea of price correction means that when the number of tasks in a package is small, the price of the package should be greater than the simple sum of the task prices; when the number of tasks in a package is large, the price of the package should be less than the simple sum.
The function is fitted from the package data structures in the small world. Taking the number of tasks in a package as the independent variable and the package price obtained by simple summation as the dependent variable gives the blue line in the fitting diagram. Since the growth rate of the logarithm function decreases as the independent variable increases, the logarithm function is adopted to correct the price. Fitting the logarithmic function to the data gives a correction formula in $\ln x$ with the constant 200, where $x$ is the number of tasks in the package and $y$ is the corrected package price. The curve shows that 12 tasks is the dividing point: the revised price of packages with fewer tasks increases to a certain extent, which improves members' willingness to complete such packages, while for packages with more than 12 tasks the revised price is reduced to a certain extent, which reduces the number of members competing for them. Therefore, in theory, the new scheme effectively considers the impact of packaging on pricing and further optimizes the pricing model.
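The behavior of a logarithmic correction around the dividing point can be illustrated with a small sketch. It assumes a correction of the form $y = a \ln x + b$, which is an assumption about the shape of the fitted formula and not a reconstruction of it; the intercept of 200, the unit price of 70 yuan and both function names are illustrative values chosen here so that the curve meets the simple sum at 12 tasks.

```python
import math

def calibrate_a(crossing_count, unit_price, b):
    """Choose a so that a*ln(x) + b equals the simple sum at the crossing count."""
    return (unit_price * crossing_count - b) / math.log(crossing_count)

def corrected_price(count, a, b):
    """Corrected package price under the assumed form a*ln(count) + b."""
    return a * math.log(count) + b

UNIT_PRICE = 70.0   # hypothetical average task price in yuan
INTERCEPT = 200.0   # illustrative intercept (assumption)
a = calibrate_a(12, UNIT_PRICE, INTERCEPT)

small = corrected_price(5, a, INTERCEPT)    # package with few tasks
large = corrected_price(20, a, INTERCEPT)   # package with many tasks
```

Because the logarithm is concave, the corrected price lies above the simple sum below the crossing count and below it afterwards, which is the qualitative effect the correction is meant to have.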

Model Comparison
Theoretically, the packaged pricing model takes into account more comprehensive factors, making the model more optimized.It is also necessary to analyze the effects of the model based on actual data.
MATLAB is used to solve the pricing under the new scheme, with each package treated as a single task as in the previous problem. As in the second part, task completion is measured with the discrimination value, and the change in completion degree is analyzed. The calculated completion rate under this scheme is 72%, which is 4.34 percentage points higher than the 67.66% obtained without packaging, indicating that the pricing scheme that considers packaging is better.

Advantages of the Model
(1) In the first part, the p-d interaction factor and the difficulty of the task were added to consider the impact on price, making the perspective more comprehensive.
(2) In the second part, the measurement index to evaluate the matching degree of the task is given, and the task is priced according to the measurement index, so the pricing is more accurate.
(3) In the third part, the analysis of the general situation of package pricing through the small-world model not only simplifies the analysis process, but also is representative.

Disadvantages of the Model
(1) Due to the small amount of data given in the annex, the impact of some factors cannot be specifically considered, leading to the inability to consider the factors affecting pricing more comprehensively.
(2) In practice, there may be some influences of personal preference and some random factors when members complete tasks, so there may be some deviations in the results obtained.
(3) In order to simplify the problem, when describing the correlation between factors, more empirical formulas were used instead of more in-depth analysis of the correlation between factors.

Improvement Direction of the Model
Since the given data cover only a few aspects, more specific data from different perspectives should be collected when possible, such as members' personal preferences, members' free time and the correlation between various factors, so that a more accurate pricing scheme can be obtained by pricing tasks from multiple perspectives.

Conclusion
In this paper, the price function is fitted by means of multiple stepwise regression, and the reasons for unfinished tasks are analyzed, including the difficulty of task completion, distance, and the relationship between task quantity and population density. Then, a model matching tasks and members is established using the idea of the "optimal neighbor". Next, starting from the "small world" model, the actual packaging situation is simulated, and a price correction function is defined to reduce the impact of task distribution density on completion, which further optimizes the pricing scheme. This line of thought provides a reference for the service provider: when pricing tasks, the company should mainly consider the difficulty of the task, its distance from the city and the population density. At the same time, tasks that are close together and of similar difficulty can be packaged, which is profitable and improves task completion.

Figure 3.1. Pricing scenarios for tasks in different regions under the new scheme

Figure 4.2. A fitting diagram of the package data
Again using the k-means method, the members in annex ii are divided into four classes according to their locations (figure 2-2). The clustering center of the purple category is (22.91, 22.91), that of the red category is (22.99, 22.99), that of the blue category is (23.16, 23.16), and that of the green category is (22.64, 114.07).
Here the degree of price change is computed from $P_i$, the price of task $i$, and $\bar{P}$, the average price of the category to which task $i$ belongs.

Table 3.2. Comparison of data indexes between the original scheme and the new scheme