Clustering Similar Text Elements and Finding Duplicates Sample Clauses

Clustering Similar Text Elements and Finding Duplicates. In software projects, especially open source projects, there are often issue trackers where users can submit bug reports and feature requests. Popular projects have hundreds or even thousands of open issues. Once a new issue is submitted, one of the project maintainers needs to check for and mark duplicates. Such a task is important because it helps to reduce the number of open issues to those that matter. Research has tackled this problem by introducing automated approaches for detecting duplicates [40], [18]. In addition, research has tried to understand whether duplicates are harmful or not. For example, Xxxxxxxxxx et al. [2] found that duplicates are created because previous reports lack information and that duplicates can add value by including more information. Sometimes, bug reports are only interesting for developers if they have a certain level of severity and if a xxxxxxxx xxxx of people are affected [28]. Therefore, there are approaches that group issues by their type (e.g., bug reports, feature requests) and then cluster them by their similarity (e.g., based on NLP metrics, such as tf-idf). These clusters need to reach a minimum size to be considered for the next release of the software [37].

The specific approach to issue and bug report clustering varies a lot based on context. For example, developers and technically savvy stakeholders attach the stack trace (i.e., information about the active subroutines of the program affected by the bug) to the report. Consequently, machine learning techniques leverage this information to group stack traces, which are more structured than natural language text, and from those groups identify similar reports [53]. Evaluated internally at Microsoft, this approach resulted in an F-measure of 0.88 and can facilitate diagnosis and prioritization of issues to be addressed. However, as the authors report [53], its efficacy in large-scale open source projects has yet to be established. Similarly, Xxxxx et al. [54] augmented natural language processing features with execution information about the context in which the issue/bug was observed. Training their model on the Firefox bug report dataset, they were able to detect up to 93% of duplicate bugs (compared to 72% when using natural language features alone).

Clustering issue tracker items and bug reports according to their text elements is also useful for automatically generating summaries. To that end, Xxxxxxx et al. [55] clustered bug reports leveraging their conversational features, such as the position of the s...
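The type-then-similarity grouping described earlier (group issues by type, cluster them by tf-idf similarity, keep only clusters that reach a minimum size) can be illustrated with a minimal sketch. It is not the method of any cited work: the function names, the single-link clustering rule, the smoothed idf formula, the threshold of 0.25, and the sample issues are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalised tf-idf vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed idf (an assumed variant): ln((1 + n) / (1 + df)) + 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def cluster_issues(issues, threshold=0.25, min_size=2):
    """Group (issue_type, text) pairs by type, then link issues whose
    cosine similarity reaches `threshold`; keep clusters of at least
    `min_size` members. Returns lists of indices into `issues`."""
    vectors = tfidf_vectors([text for _, text in issues])
    by_type = {}
    for i, (issue_type, _) in enumerate(issues):
        by_type.setdefault(issue_type, []).append(i)

    clusters = []
    for indices in by_type.values():
        parent = {i: i for i in indices}  # union-find forest for this type

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for pos, i in enumerate(indices):
            for j in indices[pos + 1:]:
                if cosine(vectors[i], vectors[j]) >= threshold:
                    parent[find(i)] = find(j)

        groups = {}
        for i in indices:
            groups.setdefault(find(i), []).append(i)
        clusters.extend(g for g in groups.values() if len(g) >= min_size)
    return clusters


# Hypothetical sample data, not taken from any real issue tracker.
issues = [
    ("bug", "App crashes when opening settings page"),
    ("bug", "Crash on opening the settings page"),
    ("bug", "Login button does not respond"),
    ("feature", "Add dark mode to settings page"),
    ("feature", "Please add a dark mode theme"),
]
clusters = cluster_issues(issues)
print(clusters)
```

Issues of different types are never compared, so a feature request that mentions the settings page is not merged with the crash reports about it. In practice the threshold and minimum cluster size would need tuning per project.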