  • Friday April 26, 2024, 5:38 am
Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - kalos

Pages: prev1 2 3 4 5 6 [7] 8 9 10 11 12 ... 73next
151
Hello!

My data contains many client names, and very often several of them belong to the same client group.

For example, Batavia Insurance, Batavia Fund, etc. probably belong to the Batavia group.

Matching on the first word would mostly work, so how could I write an Excel function to group/count entries whose cells share the same first word?

But it is not 100% safe, e.g. when you have PT Batavia Company, the prefix breaks the algorithm.

I know it will never be 100% accurate, but do you have any idea how to approach this?

Thanks!
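
One way to approach the prefix problem is to strip a short list of known legal-form prefixes before taking the first word. Below is a minimal Python sketch of that idea, not an Excel formula; `IGNORED_PREFIXES`, `group_key` and `count_groups` are hypothetical names, and the prefix list is an assumption that would need to be filled in from the real data. For the plain first-word case in Excel itself, something like =LEFT(A1, FIND(" ", A1 & " ") - 1) should return the first word of A1.

```python
from collections import Counter

# Hypothetical list of prefixes that should not count as the "first word".
IGNORED_PREFIXES = {"PT"}

def group_key(name):
    """Return the first significant word of a client name, lower-cased."""
    words = name.replace(".", " ").split()
    # Skip leading prefixes, but always keep at least one word.
    while len(words) > 1 and words[0].upper() in IGNORED_PREFIXES:
        words = words[1:]
    return words[0].casefold() if words else ""

def count_groups(names):
    """Count how many names fall under each group key."""
    return Counter(group_key(name) for name in names)
```

Calling `count_groups(["Batavia Insurance", "Batavia Fund", "PT Batavia Company"])` puts all three names under the key `batavia`. It will still never be 100% accurate, because two unrelated clients can share a first word.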

152
Living Room / Re: How to model this?
« on: September 26, 2018, 03:15 AM »
Oh damn, I forgot I am talking about EXCEL lol

153
Living Room / How to model this?
« on: September 24, 2018, 04:56 PM »
Hello,

I want to model the processing of some cases. I know the processing time of each case and the number of employees, so I can find the end date.

However, depending on the deadline, the cases may need to be reprocessed every fixed number of months, so the total number of cases may increase depending on the end date.

How can I model this? I find it difficult because the number of cases to be reprocessed affects the end date, but the end date also affects the number of cases to be reprocessed!

Any idea?

Thanks
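
A circular dependency like this can often be handled by guessing, recomputing, and repeating until the guess stops changing (a fixed-point iteration). The Python sketch below works under assumptions that are not in the post: every case is reprocessed once per full interval elapsed, a reprocess costs `rework_days` person-days (possibly less than the first pass), and the employees share the work evenly. All names are hypothetical.

```python
def estimate_end(cases, days_per_case, rework_days, employees,
                 interval_days, max_rounds=100):
    """Return (duration_in_days, reprocess_rounds), or None if it never settles."""
    rounds = 0  # current guess: how many times each case gets reprocessed
    for _ in range(max_rounds):
        # Total effort in person-days for the current guess.
        effort = cases * (days_per_case + rounds * rework_days)
        # Ceiling division: calendar days needed by the whole team.
        duration = -(-effort // employees)
        # How many reprocessing rounds that duration actually triggers.
        needed = duration // interval_days
        if needed == rounds:
            return duration, rounds  # guess and result agree
        rounds = needed
    return None  # workload keeps growing faster than the team can clear it
```

For example, `estimate_end(100, 6, 1, 2, 180)` settles at 350 days with one reprocessing round. When a reprocess costs as much as the first pass and the first pass alone already takes longer than the interval, the loop never settles and the function returns `None`, which is the model's way of saying the team cannot finish.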

154
What's the difference?

You either have 3 lines that say:

3  Product1
1  Product2
1  Product3

Or three files that contain lines that say:

File "Product1.txt"
Product1
Product1
Product1

File "Product2.txt"
Product2

File "Product3.txt"
Product3

Either way all you're getting is a count of how many times a match appears.

No it's not the same, because the regex will be different! And I want to store the whole regex match in the file, which will be huge multiline text!

155
gc FILEPATH\out.txt | group | select Count,Name > FILEPATH\out-counted.txt

No you misunderstood. I don't want to count matches. I want to group them and output them in a separate file.

For example, I will search for my regex:
<html:producttype>(.+?)</html:producttype>
The possible matches will be:
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product2</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product3</html:producttype>
etc

I want the script to create one file with the matches where the (.+?) is the same, so:
1 file that contains:
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
1 file that contains:
<html:producttype>Product2</html:producttype>
and 1 file that contains:
<html:producttype>Product3</html:producttype>

Thanks!

156
Guys, after I search for regex matches in a text, how can I group the matches to separate files, by same reference inside the regex match?

For example, for every regex match <html:producttype>(.+?)</html:producttype>, I want to output to a separate file all the matches where the (.+?) is the same.

Any idea? Also, please explain the strategy/pseudocode to see how that would work.
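
The strategy can be sketched in three steps: find every match, use the captured part as a dictionary key, and append the whole match to that key's list; then write one file per key. Here is a minimal Python sketch of that, assuming the whole text fits in memory. The names `group_matches` and `write_groups` and the `group_N.txt` file naming are made up for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

PATTERN = re.compile(r"<html:producttype>(.+?)</html:producttype>", re.DOTALL)

def group_matches(text):
    """Map each captured value to the list of whole matches that contain it."""
    groups = defaultdict(list)
    for match in PATTERN.finditer(text):
        groups[match.group(1)].append(match.group(0))
    return dict(groups)

def write_groups(groups, out_dir):
    """Write one file per group and return the paths that were written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, matches in enumerate(groups.values(), start=1):
        # Numbered names avoid putting the captured text into a file name.
        path = out / f"group_{index}.txt"
        path.write_text("\n".join(matches) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

The files are numbered rather than named after the captured text because the capture may be long, multiline, or contain characters that are not valid in a file name.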

157
General Software Discussion / Re: Big Data tools
« on: September 14, 2018, 12:36 PM »
Is there a 'free for commercial use' software that opens and processes with regex large text files of 20GB?

TL;DR; Yes.

Suggestions:
  • Any Linux distro with a non-commercial license (most will fit)
  • Linux-tools for Windows (if it's supposed to run on Windows, you didn't say it should)
In any of these environments use tools like grep, sed, awk, perl, python etc. for text processing.

  • 'Plain' Windows (7 and newer)
Use powershell, like shown & explained in your other thread.


Thanks, but I also want to be able to view the content, like in EmEditor.

158
Living Room / Re: Looking for smartphone
« on: September 14, 2018, 12:35 PM »
What are the top 3-5 cheapest mobiles with NFC and a battery over 5000 mAh?
Any idea?

Any suggestion for the best phone with longest battery and NFC?

159
General Software Discussion / Big Data tools
« on: September 14, 2018, 09:47 AM »
Hello!
Is there a 'free for commercial use' software that opens and processes with regex large text files of 20GB?
Thanks!
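
For the processing half of the question (not the viewing half), a file of that size does not have to be loaded at once. A minimal Python sketch, assuming each match sits on a single line and the pattern has one capture group; `iter_captures` is a hypothetical helper name.

```python
import re

def iter_captures(path, pattern, encoding="utf-8"):
    """Yield the first capture group of every match, reading line by line."""
    regex = re.compile(pattern)
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:  # only one line is held in memory at a time
            for match in regex.finditer(line):
                yield match.group(1)
```

Because it is a generator, the results can be written out or counted as they arrive. Matches that span several lines would need a different approach, such as reading the file in larger chunks.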

160
What could be the problem?
You haven't shared the file, so we'll never know, unless...

I made it work like that:
gci FILEPATH | sls -AllMatches '<html:productType>(.+?)<\/html:productType>' | % { $_.Matches } | % { $_.Groups[1].Value } >> FILEPATH\out.txt

But I don't know how I made it work lol, can you spot the error? Also, I know I asked before, but can you point me to somewhere that explains % { $_.Matches } | % { $_.Groups[1].Value } ?
I think % means 'for every' and $_.Matches is the object variable of the matches, while $_.Groups[1].Value is the content value of the matches objects, right? But what is [1]?

UPDATE: it seems both work, but which would be better?
Thanks!
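
On the [1] question: in .NET regular expressions, which PowerShell uses, group 0 is the whole match and group 1 is whatever the first pair of parentheses captured. Python numbers groups the same way, so the small sketch below shows the difference; the sample text is made up for illustration.

```python
import re

text = "<html:productType>PRD_XE</html:productType>"
match = re.search(r"<html:productType>(.+?)</html:productType>", text)

whole = match.group(0)  # the entire matched text, tags included
first = match.group(1)  # only what the first (...) captured
```

Here `whole` is the full tag pair and `first` is just `PRD_XE`. A second pair of parentheses in the pattern would be group 2, and so on.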

161
Any hint?
We've been here before: https://www.donation....msg422274#msg422274

Ah great thanks!

I tested it and there is an issue. I searched in the file and there is only one instance of <html:productType>(.+?)</html:productType>
However, the output file mentioned the above value (.+?) twice. What could be the problem?

Thanks!

gci C:\XML.xml | % { sls $_.Name -Pattern '<html:productType>(.+?)<\/html:productType>' -a | % { $_.Matches } | % { $_.Groups[1].Value } >> C:\out.txt }

162
Guys, can anyone tell me the command that will find all the regex matches, isolate a specific part of each regex match and output all of them in a file?

I have the regex, but I don't know how to indicate a part in it.
The regex is this: "<html:productType>(.+?)</html:productType>"
I used the parentheses to isolate the part of the regex that I want to be output in the file.

What should the whole command look like?

I found online and wrote this:
[regex]::match($s,"<html:productType>(.+?)</html:productType>").Groups[1].Value
But I don't know where you specify the source text or if it is correct. Any hint?

Thanks!

PS: It is really a nightmare to do some simple stuff in Powershell. There is very poor and incomplete documentation. Do you think there could be any other solution? Python maybe or anything else? I need it to work with big data though and if it has GUI it would be nice. Also, it needs to be free for commercial and any use.


Anyone please?

163
Guys, can anyone tell me the command that will find all the regex matches, isolate a specific part of each regex match and output all of them in a file?

I have the regex, but I don't know how to indicate a part in it.
The regex is this: "<html:productType>(.+?)</html:productType>"
I used the parentheses to isolate the part of the regex that I want to be output in the file.

What should the whole command look like?

I found online and wrote this:
[regex]::match($s,"<html:productType>(.+?)</html:productType>").Groups[1].Value
But I don't know where you specify the source text or if it is correct. Any hint?

Thanks!

PS: It is really a nightmare to do some simple stuff in Powershell. There is very poor and incomplete documentation. Do you think there could be any other solution? Python maybe or anything else? I need it to work with big data though and if it has GUI it would be nice. Also, it needs to be free for commercial and any use.

164
Living Room / Label printer
« on: September 05, 2018, 04:37 PM »
Hi!

I want a very cheap and very compact solution to print return labels when I am returning goods bought online.

Is there anything like that?

A normal printer can be cheap but it is too big. Some portable printers are more compact but very expensive for the use I want.

It doesn't have to be A4, it can be much smaller!

Any idea?

Thanks!

165
Living Room / Re: Looking for smartphone
« on: August 22, 2018, 10:11 AM »
What are the top 3-5 cheapest mobiles with NFC and a battery over 5000 mAh?
Any idea?

166
Any idea why the below does not work?
As usual you are asking half questions without *any* documentation. And you still haven't answered all previous questions, as requested, (and even have asked new questions in the half-baked 'answer') so in my book, you're not yet ready to ask new questions.

But I need ad-hoc answers. It's not about a specific thing I am trying to achieve, but mostly about learning.

167
Thanks, but it needs me to run it as admin, which I cannot.

Any idea why the below does not work?

(gc *.xml) -match '(?s)<\?xml\ version="1\.0"\ encoding="UTF-8"\?>.+?</dbts:PmryObj>'

168
Also, I still have not figured out how to make Powershell match . any character including newline... Any hint?

Learn to use Powershell's built-in help system:
Code: PowerShell
Get-Help about_comparison_operators

Learn to use Google:
http://lmgtfy.com/?q...er+including+newline
http://lmgtfy.com/?q...egex+match+multiline
http://bfy.tw/JV48

I tried that, but it is not clear whether (?s) goes:
at the beginning of the regex, after the opening '
at the beginning of the regex, before the opening '
or just before the .

Any idea?
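
As far as regex syntax goes, an inline flag such as (?s) is part of the pattern text itself, so it belongs inside the quotes, at the very start of the pattern. A minimal Python sketch of the effect (the sample text is made up; .NET regex accepts the same inline (?s) option):

```python
import re

text = "<a>line one\nline two</a>"

# Without the flag, "." stops at the newline, so nothing matches.
no_flag = re.search(r"<a>(.+?)</a>", text)

# Inline flag: first thing inside the quoted pattern.
inline = re.search(r"(?s)<a>(.+?)</a>", text)

# Same behaviour via a keyword argument instead of an inline flag.
keyword = re.search(r"<a>(.+?)</a>", text, flags=re.DOTALL)
```

The flag only helps if the regex sees the whole text as one string. If the input is handed over one line at a time, no single line contains a newline to match across; in PowerShell that probably means reading the file with Get-Content -Raw rather than plain gc.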

169
but I cannot see OR in the list of operators

Code: PowerShell
Get-Help about_Logical_Operators

I did that, but I get this:

PS H:\> Get-Help about_Logical_Operators
Get-Help : Get-Help could not find about_Logical_Operators in a help file in this session. To download updated help
topics type: "Update-Help". To get help online, search for the help topic in the TechNet library at
http://go.microsoft.com/fwlink/?LinkID=107116.
At line:1 char:1
+ Get-Help about_Logical_Operators
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ResourceUnavailable: (:) [Get-Help], HelpNotFoundException
    + FullyQualifiedErrorId : HelpNotFound,Microsoft.PowerShell.Commands.GetHelpCommand

170
Also, I still have not figured out how to make Powershell match . any character including newline... Any hint?

Learn to use Powershell's built-in help system:
Code: PowerShell
Get-Help about_comparison_operators

Learn to use Google:
http://lmgtfy.com/?q...er+including+newline
http://lmgtfy.com/?q...egex+match+multiline
http://bfy.tw/JV48

but I cannot see OR in the list of operators

171
How can I do that?
That's why we asked more specific questions, but you never answered them.
So then I gave you the assignment of answering all our unanswered questions, but you haven't done that up until now, so basically, we are waiting (but not holding our breath) for your answers, before accepting new questions. :(


OK I start again:

That finally makes some sense. Here is an example solution for putting that into a .csv formatted file.

The problem with that is that I do not always know the node tree hierarchy, and it may also change per record! That's why I cannot use the exact node tree hierarchy to extract a value, but I can use a guess of it, if that helps, e.g. //NODE1/*/NODE3/ ?

It doesn't even look a teensy bit like this new data you've given just now, are you playing us?

It looks the same to me??? Only the attribute names and values change. But again, the records do not all contain the same attributes, nor in the same order. There can be some basic rules that all records follow, but unfortunately the data structure is not consistent. That's why I want to use regex, to include some fuzziness in the matching!

Also, I still have not figured out how to make Powershell match . any character including newline... Any hint?

172
I don't understand why you do not answer my specific questions, regardless of the source data format and the desired output. Is what I am asking not possible to be done with Powershell?

For example, I want to perform a regex match that will output all matches of regex1 and regex2 and regex3.

How can I do that?
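
In regex terms, "all matches of regex1 and regex2 and regex3" is usually written as one pattern with the alternatives joined by |. A minimal Python sketch follows; the three patterns are only examples taken from tag names used elsewhere in these posts, and `find_any` is a hypothetical name.

```python
import re

# Example patterns; any number can be combined the same way.
PATTERNS = [
    r"<html:productType>.+?</html:productType>",
    r"<html:productId>.+?</html:productId>",
    r"<html:location>.+?</html:location>",
]

# Wrap each alternative in a non-capturing group so "|" splits cleanly.
COMBINED = re.compile("|".join(f"(?:{p})" for p in PATTERNS))

def find_any(text):
    """Return every match of any pattern, in the order found in the text."""
    return [match.group(0) for match in COMBINED.finditer(text)]
```

The matches come back in document order, regardless of the order in which the patterns are listed, which helps when the field order inside a record is not guaranteed.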

173
And one last question... when you say duplicate, you mean the whole record is duplicated?  Or just some of the fields, i.e. productID or prod id?

Some fields, eg there may be more than one assignedDate value, so the script will need to process these additional fields for the same prod.

The pseudocode I am looking for is like this:
1) search for the first 'prod' section of the file, convert it to a single line, and extract the appropriate regex matches (all of them) one after the other (that's why I want to specify all the regexes that the script should search for when scanning the line, as I am not sure which order they will be in - it shouldn't change, but just in case)
2) then find the next 'prod' section in the file, convert it to a single line, put it on a line below the previous one, then extract the regex matches one by one

Any hint?

I tried to use ¦ to add OR regex matches, but I think it didn't work.

174
So it's always xml and it's always that schema?  And you're just worried about duplicates?


Yeah, for now it looks like that.

175
OK, so the input is:
<html:products>
    <html:prod id="prod1">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD</html:classificationType>
          <html:productType>PRD_XE</html:productType>
          <html:productId>10004</html:productId>
          <html:assignedDate>2018-07-23</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS</html:name>
          <html:legalEntity>REP_XE</html:legalEntity>
          <html:location>ED</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>

The above continues to prod2 etc.

The output of the data would be:
prod1; PRD; PRD_XE; 10004; 2018-07-23; REPAIRS; REP_XE; ED
Then a new line would start with:
prod2; etc


However, I want to convert the input data into a string, because I may need to match longer substrings than e.g. "<html:classificationType>(.+?)</html:classificationType>"
Also, I think there may be duplicates for each prod, e.g. more than one assignedDate node with different values, so MatchAll would be best.
thanks!
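
A minimal Python sketch of that input-to-output step, treating the file as one string and using regex rather than an XML parser, as described above. It assumes every record is wrapped in an `<html:prod id="...">` element and that the wanted values are the elements containing plain text only; `flatten` is a hypothetical name.

```python
import re

# One record: the id attribute plus everything up to the closing prod tag.
PROD = re.compile(r'<html:prod id="([^"]+)">(.*?)</html:prod>', re.DOTALL)
# A leaf element: an opening tag, text without "<" or ">", then a closing tag.
LEAF = re.compile(r"<html:\w+>([^<>]+)</html:\w+>")

def flatten(xml_text):
    """Return one '; '-separated line per prod record."""
    lines = []
    for prod in PROD.finditer(xml_text):
        values = [value.strip() for value in LEAF.findall(prod.group(2))]
        lines.append("; ".join([prod.group(1)] + values))
    return lines
```

Since every leaf value is collected in document order, a record with two assignedDate nodes simply produces two date fields on its line. The closing tag name is not checked against the opening one, so slightly inconsistent records still come through.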
