Finding duplicates in the EPDM vault

Hi all

I am addressing you with a question related to Enterprise PDM.

In the company where I have been working for a year, Enterprise PDM was implemented 5 years ago without any user training or awareness-raising. Everyone had to learn to use PDM on their own.

Today, we find ourselves with a vault that is filling up fast and slowing down just as much. The growth is partly due to the fact that duplicates can be saved in the vault. These duplicates are mainly 3D models of purchased parts which, through a bad habit (in my opinion), are stored in the project folders rather than in a shared library... (Personal reflection: I didn't imagine that people could still work like this!)

My question is as follows:

I would like to list all the duplicates. Is there a tool or a method that would let me list all the duplicates (identical names and/or file types), their location in the vault, and the assemblies in which they are used?

Thank you in advance for your answers.

Hello

Natively, apart from the search form, there is no direct way to detect duplicates. Listing where they are used is also done in that interface.

To automate the search and analysis, you can use Excel macros, but the hardest part is finding the criteria that identify duplicates.

If the file name is identical, it's easy, but if you have to work with the file properties to find similarities, it becomes a little more complex.

As for the personal reflection, this is unfortunately the case in many companies (mainly SMEs).

Thank you for your answer Cyril.f!

If I understand correctly, unless I'm a programmer skilled enough with Excel macros to dig into a database that isn't in Excel, I just have to search through the vault's 514 GB?

For pity's sake, tell me no! LOL

It's not very complicated, I can provide you with snippets of code for analyzing the use of duplicate files.

On the other hand, unfortunately nothing lets you check for duplicates after the fact. The only setting for future files is to disallow duplicate file names across the entire vault; it can be set in the file type options in the administration tool. However, this does not stop a clever user from saving a duplicate under another name.

The other way is to prohibit copying certain files/folders when using the Copy Tree function in the vault, but this setting does not apply if someone goes through Pack and Go in SolidWorks (I have submitted an enhancement request to SolidWorks about this).

You can always check with your reseller whether they have a tool for the job, but I'm afraid you will have to write your own code (I'm willing to help if needed), because in general it depends heavily on the "customer" (in this case, your company).

Thank you again Cyril...

I would gladly take your code snippets to try to get by, and I may call on you if I really need to and if I am actually given a little time for this job... Thank you very much for the offer in any case.

In any case, looking at the situation, I have the impression that I am facing a "human" problem: a reluctance to change work habits and to think a little when a file you want to save in the vault already exists under the same name... I have already tried ticking the box that prohibits duplicates in the administration tool, and the conflicts showed up quickly!! Given the constant pressure to move projects forward, I had to backtrack...

I think it's a wasted effort! :-(

Hello again

Below is the code snippet:

Dim i As Long
Dim j As Long
Dim k As Long
Dim vault               As EdmVault5
Dim folder              As IEdmFolder6
Dim varEnum             As IEdmEnumeratorVariable5
Dim valueRes            As Variant
Dim file                As IEdmFile6
Dim ref                 As IEdmReference6
Dim pos                 As IEdmPos5

Sub AnalyseHisto()
i = 2
j = 2
k = 2

Set vault = New EdmVault5
vault.LoginAuto "xxx", 0 'Replace xxx with the vault name, without the C:\
Do While Worksheets("Feuil1").Cells(i, 2) <> ""
    If Worksheets("Feuil1").Cells(i, 3) = "" Then i = i + 1
    Worksheets("Feuil2").Cells(k, 1) = Worksheets("Feuil1").Cells(i, 1)
    Worksheets("Feuil2").Cells(k, 2) = Worksheets("Feuil1").Cells(i, 2)
    Worksheets("Feuil2").Cells(k, 3) = Worksheets("Feuil1").Cells(i, 3)
    Set folder = vault.GetFolderFromPath(Worksheets("Feuil2").Cells(k, 1))
    Set file = vault.GetFileFromPath(Worksheets("Feuil2").Cells(k, 1) & Worksheets("Feuil2").Cells(k, 2))
    Set ref = file.GetReferenceTree(folder.ID, Worksheets("Feuil2").Cells(k, 3))
    Set pos = ref.GetFirstParentPosition(Worksheets("Feuil2").Cells(k, 3), False)
    If Not pos.IsNull Then j = k
    While Not pos.IsNull
        Set ref = ref.GetNextParent(pos)
        Worksheets("Feuil2").Cells(j, 4) = ref.Name
        Worksheets("Feuil2").Cells(j, 5) = ref.folder.LocalPath
        Worksheets("Feuil2").Cells(j, 6) = ref.VersionRef
        j = j + 1
        k = j - 1
    Wend
        k = k + 1
    i = i + 1
Loop
End Sub

To explain how it works: sheet 1 (Feuil1 in the code) serves as the database holding the files I want to analyze, and sheet 2 (Feuil2) displays the result of the where-used analysis for each file.

The data are laid out as follows:

  • Column A: Full folder path, of the form C:\xxx\ (don't forget the trailing \)
  • Column B: File name with extension
  • Column C: Version number of the analyzed file (optional)

If you don't want to use the version number, remove the Worksheets("Feuil2").Cells(k, 3) argument from the Set ref and Set pos lines and replace it with 0; the macro then uses the latest version of the analyzed file.
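
If the list of files to analyze arrives as full paths, the column layout above can be prepared with a short script. This is only a sketch in Python, separate from the macro; the function name to_sheet_row and the sample path are made up for illustration, and writing 0 in column C relies on the remark above that 0 means the latest version.

```python
import ntpath


def to_sheet_row(full_path, version=0):
    """Split a full vault path into the (folder, file name, version)
    triple expected in columns A, B and C."""
    folder, name = ntpath.split(full_path)
    # The macro concatenates column A and column B, so the folder
    # must end with a backslash.
    if not folder.endswith("\\"):
        folder += "\\"
    return folder, name, version


# Made-up example path; replace with the paths from your own list.
row = to_sheet_row("C:\\xxx\\Project_A\\screw_m6.sldprt")
print(row)
```

Each returned triple corresponds to one row of sheet 1.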

 

Your answer makes me doubt how clear my request was (or how well I understood the answer).

I'm not looking to search for duplicates in a list in an Excel file... I want to find CAD files that share the same name (names I don't know in advance) in my Enterprise PDM vault (linked to SolidWorks).

Is that how you understood my question Cyril.f?

Hello

You can use the EPDM search tool to search for all files whose name contains the text .sld, then export the result to CSV and open it in Excel.

http://help.solidworks.com/2016/English/EnterprisePDM/FileExplorer/t_Searching_in_Vaults_search_tool.htm

Kind regards

Indeed, I left out the step where you search with the search tool and export the list.

Otherwise, it is also possible to automate the search and compare the names of the results.
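
For anyone who prefers to compare the names outside Excel, here is a minimal Python sketch of that step. It assumes the search result was exported to CSV with one row per file and two columns, here called Name and Path; those column names, the function name find_duplicate_names and the sample rows are assumptions for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_names(csv_file, name_col="Name", path_col="Path", delimiter=","):
    """Group exported rows by file name (case-insensitive) and keep
    only the names that appear in more than one location."""
    locations = defaultdict(list)
    for row in csv.DictReader(csv_file, delimiter=delimiter):
        # Windows file names are case-insensitive, so compare in lower case.
        locations[row[name_col].strip().lower()].append(row[path_col].strip())
    return {name: paths for name, paths in locations.items() if len(paths) > 1}


# Tiny in-memory sample standing in for the exported file (made-up data).
sample = io.StringIO(
    "Name,Path\n"
    "screw_m6.sldprt,C:\\xxx\\Project_A\\\n"
    "Screw_M6.SLDPRT,C:\\xxx\\Project_B\\\n"
    "bracket.sldprt,C:\\xxx\\Project_A\\\n"
)
duplicates = find_duplicate_names(sample)
for name, paths in duplicates.items():
    print(name, "->", paths)
```

With a real export, pass an open file instead of the sample, for example open("export.csv", newline=""), and set the delimiter to ";" if your export uses semicolons.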

Hello, one question bothers me. Once you find the duplicates, you're going to delete them, right? And then, when you open an assembly that used one of the deleted duplicates, you will have to repair the path in each of them!

I think it's best to leave what has already been done as it is, but from now on enable the option "impossible to archive duplicates" (or equivalent).

That way, you don't touch the old assemblies and you stop the problem from growing.

 

Regards,

G.

@G.: In our case, once a duplicate is detected, we rename it to xxx-doublon so as not to break the old assembly versions, but we also make the file unusable so that users are forced to update the assembly with the right link (when you end up with a block of text instead of a screw, it generally gets people moving). Since our users tend not to check whether they are loading the latest version, this works pretty well.

@JMo: For the search via macro and the analysis, you can start from this:

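'Module-level variables shared by Search_File and Traitement.
'vault, folder, file and i are the ones declared with the first snippet.
Dim Search  As IEdmSearch5
Dim result  As IEdmSearchResult5

Sub Main()
Const cCoffre = "C:\xxx\"    'Root folder of the vault view
Set vault = New EdmVault5
vault.LoginAuto "xxx", 0     'Replace xxx with the vault name
Set folder = vault.GetFolderFromPath(cCoffre)
i = 2                        'First output row
Call Search_File
Call Traitement
End Sub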

Sub Search_File()
Set Search = vault.CreateSearch
Search.StartFolderID = folder.ID
Search.FindFolders = False
Search.filename = "" 'Enter a specific extension here, or leave it empty; in that case the search returns the whole content of the vault, since the folder is the vault root (to change in Set folder in Sub Main)
End Sub

Sub Traitement()
Set result = Search.GetFirstResult
While Not result Is Nothing
    Set file = vault.GetFileFromPath(result.Path)
    Cells(i, 1) = Left(result.Path, InStrRev(result.Path, "\"))
    Cells(i, 2) = file.Name
    Set result = Search.GetNextResult
    i = i+1
    DoEvents
Wend

End Sub

As it stands, for each file found, the macro writes the folder in column A and the file name with extension in column B.

You can store the file names in an array and then process it to find the duplicates. Otherwise, Excel formulas may be enough. If you need other criteria to identify duplicates (such as a perfectly identical reference or designation), you will have to go through an extraction of the files, read the data card variables, and then compare them.
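
The comparison on properties mentioned above could look like the following Python sketch once the card variables have been read out. Everything here is an assumption for illustration: the records are made up, and the variable names Reference and Designation stand in for whichever data card variables identify a part in your vault.

```python
from collections import defaultdict


def normalize(value):
    """Make values comparable: ignore case and redundant whitespace."""
    return " ".join(str(value).split()).lower()


def group_by_properties(records, keys=("Reference", "Designation")):
    """Return groups of file paths whose chosen properties all match."""
    groups = defaultdict(list)
    for record in records:
        signature = tuple(normalize(record.get(key, "")) for key in keys)
        # Skip files where every chosen property is empty: nothing to compare.
        if any(signature):
            groups[signature].append(record["Path"])
    return [paths for paths in groups.values() if len(paths) > 1]


# Made-up records standing in for the extracted card variables.
records = [
    {"Path": "C:\\xxx\\Project_A\\vis_m6.sldprt",
     "Reference": "ISO 4762 M6x20", "Designation": "Screw"},
    {"Path": "C:\\xxx\\Project_B\\screw-6.sldprt",
     "Reference": "iso 4762  m6x20", "Designation": "screw"},
    {"Path": "C:\\xxx\\Project_B\\plate.sldprt",
     "Reference": "", "Designation": ""},
]
property_duplicates = group_by_properties(records)
print(property_duplicates)
```

The normalization is deliberately crude; how strictly two values must match is a choice that depends on how the data cards were filled in.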

Thank you all for your answers.

I should be able to handle all of this.

 

@G.: The purpose of my list of duplicates and their use cases is to see directly which assemblies each duplicate is used in. The idea is then, as far as possible, to open each assembly and redo the references and mates with the copy of the file stored correctly in the library, and then delete the duplicate. Of course, pragmatism is required, since it would be completely crazy to spend time on these changes if I find 300 use cases, for example! ;-)
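
To keep that pragmatic, the where-used results can be counted per duplicate before deciding which ones are worth relinking. A minimal Python sketch, assuming the results are available as (duplicate path, parent assembly) pairs; the sample data, the name usage_counts and the cut-off of 2 are invented for illustration.

```python
from collections import Counter


def usage_counts(where_used):
    """Count how many distinct parent assemblies reference each duplicate."""
    # set() drops repeated (duplicate, parent) pairs before counting.
    return Counter(duplicate for duplicate, _parent in set(where_used))


# Made-up where-used pairs: (duplicate file, parent assembly).
where_used = [
    ("C:\\xxx\\Project_A\\screw_m6.sldprt", "frame.sldasm"),
    ("C:\\xxx\\Project_A\\screw_m6.sldprt", "cover.sldasm"),
    ("C:\\xxx\\Project_A\\screw_m6.sldprt", "door.sldasm"),
    ("C:\\xxx\\Project_B\\washer.sldprt", "frame.sldasm"),
]
MAX_RELINKS = 2  # arbitrary cut-off for "worth fixing by hand"

counts = usage_counts(where_used)
to_relink = sorted(path for path, n in counts.items() if n <= MAX_RELINKS)
to_leave = sorted(path for path, n in counts.items() if n > MAX_RELINKS)
print("relink:", to_relink)
print("leave:", to_leave)
```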

Hello

We too face this duplication problem, mainly with purchased parts, which we call library parts.

Our vault is also 5 years old, and it's true that it fills up very quickly. I imposed rules, which are not always well accepted. But I think it's important to have a common methodology to move forward. Criticism must be constructive, and it must go both ways between the designers and the admin.

I agree with G. and Cyril.f: it is essential to enable the option that refuses duplicates at check-in.

A procedure you can set up, and which works very well for us, when you encounter a duplicate:

  • Identify the duplicate file and the official file (there are often sacrifices to be made, which is hard to hear for the designer who checked in the duplicate), choosing by age, by most used, by validated status...
  • Change its state -> OBSOLETE (LIBRARIAN group only)
  • Rename the file by adding a prefix -> Deprecated - .........
  • Empty its data card so that it is "forgotten" (this keeps it out of search results)
  • Color it red so that it stands out in the assembly and can be replaced, for example when a project is copied.

You should never delete a file, because you will get into big trouble with your colleagues... ;)

My advice: leave the past as it is and focus on new projects. The beginning will be hard, but over time you will gain productivity. CHEER UP.

Today we want to standardize our made-to-drawing parts, and I'm looking for a way to warn designers that they are designing a duplicate. I know there are tools that analyze 3D geometry. Has anyone already implemented this kind of tool?

See you