Personal cloud storage services are data-intensive applications that already produce a significant share of Internet traffic. Several solutions offered by different companies attract more and more people. However, little is understood about each service's capabilities, architecture and, most of all, the performance implications of design choices. This paper presents a methodology to study cloud storage services. We apply our methodology to compare 5 popular offers, revealing different system architectures and capabilities. The performance implications of the various designs are assessed by executing a series of benchmarks. Our results show no clear winner, with all services affected by some limitations or having potential for improvement.
Personal cloud storage services allow users to synchronize local folders with servers in the cloud. They have gained popularity, with companies offering significant amounts of remote storage free of charge or at reduced prices. More and more people are attracted by these offers, saving personal files, synchronizing devices and sharing content with great simplicity. This high public interest has pushed various providers to enter the cloud storage market.
Our results extend an earlier study in which Dropbox usage is analyzed from passive measurements. In contrast to that work and to others that focus on a single service, this paper compares several solutions using active measurements. The results of that earlier study guide our benchmark definition. Other authors benchmark cloud providers, but focus only on server infrastructure. Closer to our goal, a related study evaluates Dropbox, Mozy, Carbonite and CrashPlan. Motivated by the extensive list of providers, we first propose a methodology to automate the benchmarking. Then, we analyze several synchronization scenarios and providers, shedding light on the impact of design choices on performance.
The results reveal interesting insights, such as unexpected drops in performance in common scenarios caused by both missing client capabilities and architectural differences among the services. Overall, the lessons learned are useful as guidelines to improve personal cloud storage services.
The goal of this paper is twofold. Firstly, we investigate how different providers tackle the problem of synchronizing people's files. To answer this question, we develop a methodology that helps to understand both system architecture and client capabilities. We apply our methodology to compare 5 services, revealing differences in client software, synchronization protocols and data center placement. Secondly, taking the perspective of users connected from a single location in Europe, we benchmark the selected services under the same conditions, highlighting differences that appear in various usage scenarios and emphasizing the relevance of design choices for both users and the Internet.
Methodology and Services
This section describes the methodology we follow to design benchmarks that check the capabilities and performance of personal storage services. We use active measurements relying on a testbed.
Architecture and Data Centers
We build the testbed on a single Linux server for our experiments. The Linux server both controls the experiments and hosts a virtual machine that runs the test computer (Windows 7 Enterprise). The testbed is connected to a 1 Gb/s Ethernet network at the University of Twente, in which Internet connectivity is not a bottleneck. The architecture used, the data center locations and the data center owners are important aspects of personal cloud storage, having both legal and performance implications.
Checking Capabilities
The owners of the IP addresses are identified using the whois service. For every IP address, we look up the geographic location of the server. Since popular geolocation databases are known to have serious limitations regarding cloud providers, we rely on a hybrid methodology that makes use of: (i) informative strings (e.g., international airport codes) revealed by reverse DNS lookup; (ii) the shortest Round Trip Time (RTT) to PlanetLab nodes [15]; and
(iii) active traceroute to identify the closest well-known location of a router. Previous works indicate that these methodologies provide an estimate with a precision of a few hundred kilometers, which is sufficient for our goals.
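The hybrid geolocation steps above can be illustrated with a minimal Python sketch. It is not the tooling used in the measurements: the airport-code table, the function names, the hostnames and the priority order among the three hints are assumptions made only for illustration.

```python
import re

# Hypothetical, tiny subset of airport codes; a real table would be far larger.
AIRPORT_CODES = {
    "ams": "Amsterdam",
    "dub": "Dublin",
    "fra": "Frankfurt",
    "sea": "Seattle",
}


def location_from_rdns(hostname):
    """Return the city hinted by an airport code in a reverse DNS name, or None."""
    # Split on anything that is not a letter, so "ams1" yields the token "ams".
    for token in re.split(r"[^a-z]+", hostname.lower()):
        if token in AIRPORT_CODES:
            return AIRPORT_CODES[token]
    return None


def location_from_rtt(rtt_ms_by_landmark):
    """Return the landmark (e.g., a measurement node's city) with the shortest RTT."""
    if not rtt_ms_by_landmark:
        return None
    return min(rtt_ms_by_landmark, key=rtt_ms_by_landmark.get)


def geolocate(hostname, rtt_ms_by_landmark, closest_router_city=None):
    """Combine the three hints; the priority order here is an assumption."""
    return (
        location_from_rdns(hostname)
        or location_from_rtt(rtt_ms_by_landmark)
        or closest_router_city  # e.g., taken from a traceroute run
    )
```

A call such as `geolocate("server-54.ams1.example.net", {})` would rely on the reverse DNS hint alone, while a hostname without a recognizable code falls back to the RTT measurements and then to the traceroute hint.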
Results show that up to 90% of Dropbox users' uploads carry less than 1 MB. While 50% of the batches carry only one file, a significant portion (around 10%) involves at least 100 files. Based on these results, we design benchmarks varying
(i) number of files;
(ii) file sizes; and
(iii) file types, therefore covering a variety of synchronization scenarios.
All files in the sets are created at run-time by our testing application. Other types of files, used in Sect. 4 for checking capabilities, are not included in the benchmarks for the sake of space.
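As an illustration of how such file sets could be generated at run-time, the following Python sketch creates a batch with a given number of files, file size and file type. It is a sketch under stated assumptions, not the testing application itself: the function names, the two file types and the example sizes are hypothetical.

```python
import os
import random
import string
import tempfile


def make_file(path, size, kind):
    """Write one file of exactly `size` bytes; `kind` is 'text' or 'binary'."""
    if kind == "binary":
        data = os.urandom(size)  # random bytes, essentially incompressible
    elif kind == "text":
        alphabet = string.ascii_lowercase + " "
        data = "".join(random.choices(alphabet, k=size)).encode("ascii")
    else:
        raise ValueError(f"unknown file kind: {kind}")
    with open(path, "wb") as handle:
        handle.write(data)


def make_file_set(directory, n_files, file_size, kind):
    """Create a batch of `n_files` files and return their paths."""
    extension = "txt" if kind == "text" else "bin"
    paths = []
    for index in range(n_files):
        path = os.path.join(directory, f"file_{index:04d}.{extension}")
        make_file(path, file_size, kind)
        paths.append(path)
    return paths


# Example: two illustrative workloads (the values are assumptions).
with tempfile.TemporaryDirectory() as workdir:
    for n_files, file_size, kind in [(1, 100_000, "binary"), (10, 10_000, "text")]:
        created = make_file_set(workdir, n_files, file_size, kind)
        total = sum(os.path.getsize(p) for p in created)
        print(kind, len(created), "files,", total, "bytes")
```

Generating the content at run-time means each experiment uploads data the service has not seen before, so an earlier run cannot influence a later one.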
Synchronization startup, upload time and protocol overhead are discussed. It is important to stress that all measurements are taken from one location, in the same controlled environment. While results for each service may vary when measuring from other locations or over longer intervals, the conclusions that follow are independent of that. Repeating the experiments from different locations is planned for future work.
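The three metrics named above can be made concrete with a short Python sketch. The definitions used here are assumptions for illustration: startup is the delay between the files being ready and the first storage packet, upload time spans the first to the last storage packet, and overhead is the ratio of all bytes exchanged to the size of the files. The `Packet` record and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    timestamp: float  # seconds since the start of the experiment
    size: int         # bytes on the wire
    is_storage: bool  # True for storage flows, False for control flows


def sync_metrics(packets, files_ready_at, payload_bytes):
    """Compute startup delay, upload time and protocol overhead for one run."""
    storage = [p for p in packets if p.is_storage]
    if not storage:
        raise ValueError("no storage traffic observed")
    if payload_bytes <= 0:
        raise ValueError("payload size must be positive")
    first = min(p.timestamp for p in storage)
    last = max(p.timestamp for p in storage)
    total_bytes = sum(p.size for p in packets)  # storage plus control traffic
    return {
        "startup_s": first - files_ready_at,
        "upload_s": last - first,
        "overhead": total_bytes / payload_bytes,
    }
```

Under these assumed definitions, an overhead close to 1 means that almost all exchanged bytes are file content, while larger values point to control traffic or per-file protocol costs.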
Tested Storage Services
We focus on 5 services for the sake of space, although our methodology is generic and can be applied to any other service. We restrict our analysis to local clients, since previous results show that this is by far the preferred means of using personal cloud storage services.
Dropbox, Google Drive, and SkyDrive are selected because they are among the most popular offers, according to the volume of search queries containing names of cloud storage services on Google Trends. The servers they contact are easily identified by monitoring the traffic exchanged when the client
(i) starts;
(ii) is idle; and
(iii) synchronizes files.
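The phase-based identification listed above can be sketched in Python as follows. This is only an illustration: the record format, the phase boundaries and the server names are hypothetical, and a real trace would come from a packet capture taken on the testbed.

```python
from collections import defaultdict


def servers_by_phase(observations, phases):
    """Group contacted server names by the client phase in which they appear.

    observations: iterable of (timestamp, server_name) pairs.
    phases: list of (phase_name, start, end) with half-open intervals [start, end).
    """
    grouped = defaultdict(set)
    for timestamp, server_name in observations:
        for phase_name, start, end in phases:
            if start <= timestamp < end:
                grouped[phase_name].add(server_name)
                break  # phases are assumed not to overlap
    return dict(grouped)


# Hypothetical trace: timestamps in seconds, names under example.com.
observations = [
    (1.0, "auth.example.com"),
    (5.0, "notify.example.com"),
    (12.0, "storage.example.com"),
    (13.0, "notify.example.com"),
]
phases = [("start", 0, 3), ("idle", 3, 10), ("sync", 10, 20)]
print(servers_by_phase(observations, phases))
```

Servers that appear only while files are synchronized are natural candidates for storage servers, whereas those seen at startup or while idle are more likely control servers.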
Both server names and IP addresses can be used to identify different operations during our tests. For Wuala, we use flow sizes and connection sequences to identify storage flows. We notice some relevant differences among applications during the login and idle phases. The corresponding figure reports the cumulative number of bytes exchanged with control servers during an initial 16 min in the idle state. Two considerations hold.